Activin receptor-like kinases, proteins having serine threonine kinase domains and polynucleotides encoding same

ABSTRACT

A new receptor family of activin-like kinases has been identified. Novel proteins have activin/TGF-β-type I receptor functionality, and have consequential diagnostic/therapeutic utility. They may have a serine/threonine kinase domain, a DFKSRN or DLKSKN sequence in subdomain VIB and/or a GTKRYM sequence in subdomain VIII.

THIS APPLICATION IS A 371 of PCT/GB93/02367, filed Nov. 17, 1993.

FIELD OF THE INVENTION

This invention relates to proteins having serine/threonine kinase domains, corresponding nucleic acid molecules, and their use.

BACKGROUND OF THE INVENTION

The transforming growth factor-β (TGF-β) superfamily consists of a family of structurally-related proteins, including three different mammalian isoforms of TGF-β (TGF-β1, β2 and β3), activins, inhibins, müllerian-inhibiting substance and bone morphogenic proteins (BMPs) (for reviews see Roberts and Sporn, (1990) Peptide Growth Factors and Their Receptors, Pt.1, Sporn and Roberts, eds. (Berlin: Springer-Verlag) pp 419-472; Moses et al (1990) Cell 63, 245-247). The proteins of the TGF-β superfamily have a wide variety of biological activities. TGF-β acts as a growth inhibitor for many cell types and appears to play a central role in the regulation of embryonic development, tissue regeneration, immuno-regulation, as well as in fibrosis and carcinogenesis (Roberts and Sporn (1990) see above).

Activins and inhibins were originally identified as factors which regulate secretion of follicle-stimulating hormone (Vale et al (1990) Peptide Growth Factors and Their Receptors, Pt.2, Sporn and Roberts, eds. (Berlin: Springer-Verlag) pp. 211-248). Activins were also shown to induce the differentiation of haematopoietic progenitor cells (Murata et al (1988) Proc. Natl. Acad. Sci. USA 85, 2434-2438; Eto et al (1987) Biochem. Biophys. Res. Commun. 142, 1095-1103) and induce mesoderm formation in Xenopus embryos (Smith et al (1990) Nature 345, 729-731; van den Eijnden-Van Raaij et al (1990) Nature 345, 732-734).

BMPs, or osteogenic proteins, induce the formation of bone and cartilage when implanted subcutaneously (Wozney et al (1988) Science 242, 1528-1534), facilitate neuronal differentiation (Paralkar et al (1992) J. Cell Biol. 119, 1721-1728) and induce monocyte chemotaxis (Cunningham et al (1992) Proc. Natl. Acad. Sci. USA 89, 11740-11744). Müllerian-inhibiting substance induces regression of the Müllerian duct in the male reproductive system (Cate et al (1986) Cell 45, 685-698), and a glial cell line-derived neurotrophic factor enhances survival of midbrain dopaminergic neurons (Lin et al (1993) Science 260, 1130-1132). The action of these growth factors is mediated through binding to specific cell surface receptors.

Within this family, TGF-β receptors have been most thoroughly characterized. By covalently cross-linking radio-labelled TGF-β to cell surface molecules followed by polyacrylamide gel electrophoresis of the affinity-labelled complexes, three distinct size classes of cell surface proteins (in most cases) have been identified, denoted receptor type I (53 kd), type II (75 kd), type III or betaglycan (a 300 kd proteoglycan with a 120 kd core protein) (for a review see Massague (1992) Cell 69, 1067-1070) and more recently endoglin (a homodimer of two 95 kd subunits) (Cheifetz et al (1992) J. Biol. Chem. 267, 19027-19030). Current evidence suggests that type I and type II receptors are directly involved in receptor signal transduction (Segarini et al (1989) Mol. Endo., 3, 261-272; Laiho et al (1991) J. Biol. Chem. 266, 9100-9112) and may form a heteromeric complex; the type II receptor is needed for the binding of TGF-β to the type I receptor and the type I receptor is needed for the signal transduction induced by the type II receptor (Wrana et al (1992) Cell, 71, 1003-1004). The type III receptor and endoglin may have more indirect roles, possibly by facilitating the binding of ligand to type II receptors (Wang et al (1991) Cell, 67, 797-805; López-Casillas et al (1993) Cell, 73, 1435-1444).

Binding analyses with activin A and BMP4 have led to the identification of two co-existing cross-linked affinity complexes of 50-60 kDa and 70-80 kDa on responsive cells (Hino et al (1989) J. Biol. Chem. 264, 10309-10314; Mathews and Vale (1991), Cell 68, 775-785; Paralkar et al (1991) Proc. Natl. Acad. Sci. USA 87, 8913-8917). By analogy with TGF-β receptors they are thought to be signalling receptors and have been named type I and type II receptors.

Among the type II receptors for the TGF-β superfamily of proteins, the cDNA for the activin type II receptor (Act RII) was the first to be cloned (Mathews and Vale (1991) Cell 65, 973-982). The predicted structure of the receptor was shown to be a transmembrane protein with an intracellular serine/threonine kinase domain. The activin receptor is related to the C. elegans daf-1 gene product, the ligand of which is currently unknown (Georgi et al (1990) Cell 61, 635-645). Thereafter, another form of the activin type II receptor (activin type IIB receptor), of which there are different splicing variants (Mathews et al (1992), Science 225, 1702-1705; Attisano et al (1992) Cell 68, 97-108), and the TGF-β type II receptor (TβRII) (Lin et al (1992) Cell 68, 775-785) were cloned, both of which have putative serine/threonine kinase domains.

SUMMARY OF THE INVENTION

The present invention involves the discovery of related novel peptides, including peptides having the activity of those defined herein as SEQ ID Nos. 2, 4, 8, 10, 12, 14, 16 and 18. Their discovery is based on the realisation that receptor serine/threonine kinases form a new receptor family, which may include the type II receptors for other proteins in the TGF-β superfamily. To ascertain whether there were other members of this family of receptors, a protocol was designed to clone ActRII/daf-1 related cDNAs. This approach made use of the polymerase chain reaction (PCR), using degenerate primers based upon the amino-acid sequence similarity between kinase domains of the mouse activin type II receptor and daf-1 gene products.

This strategy resulted in the isolation of a new family of receptor kinases called activin receptor-like kinases (ALKs) 1-6. These cDNAs showed an overall 33-39% sequence similarity with ActRII and TGF-β type II receptor and 40-92% sequence similarity towards each other in the kinase domains.

Soluble receptors according to the invention comprise at least predominantly the extracellular domain. These can be selected from the information provided herein, prepared in conventional manner, and used in any manner associated with the invention.

Antibodies to the peptides described herein may be raised in conventional manner. By selecting unique sequences of the peptides, antibodies having desired specificity can be obtained.

The antibodies may be monoclonal, prepared in known manner. In particular, monoclonal antibodies to the extracellular domain are of potential value in therapy.

Products of the invention are useful in diagnostic methods, e.g. to determine the presence in a sample of an analyte binding therewith, such as in an antagonist assay. Conventional techniques, e.g. an enzyme-linked immunosorbent assay, may be used.

Products of the invention having a specific receptor activity can be used in therapy, e.g. to modulate conditions associated with activin or TGF-β activity. Such conditions include fibrosis, e.g. liver cirrhosis and pulmonary fibrosis, cancer, rheumatoid arthritis and glomerulonephritis.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the alignment of the serine/threonine (S/T) kinase domains (I-VIII) of related receptors from transmembrane proteins, including embodiments of the present invention. The nomenclature of the subdomains is according to Hanks et al (1988).

FIGS. 2A to 2D show the sequences and characteristics of the respective primers used in the initial PCR reactions. The nucleic acid sequences are also given as SEQ ID Nos. 19 to 22.

FIG. 3 is a comparison of the amino-acid sequences of human activin type II receptor (Act R-II), mouse activin type IIB receptor (Act R-IIB), human TGF-β type II receptor (TβR-II), human TGF-β type I receptor (ALK-5), human activin receptor type IA (ALK-2), and type IB (ALK-4), ALKs 1 & 3 and mouse ALK-6.

FIG. 4 shows, schematically, the structures for Daf-1, Act R-II, Act R-IIB, TβR-II, TβR-I/ALK-5, ALKs-1, -2 (Act RIA), -3, -4 (Act RIB) & -6.

FIG. 5 shows the sequence alignment of the cysteine-rich domains of the ALKs, TβR-II, Act R-II, Act R-IIB and daf-1 receptors.

FIG. 6 is a comparison of kinase domains of serine/threonine kinases, showing the percentage amino-acid identity of the kinase domains.

FIG. 7 shows the pairwise alignment relationship between the kinase domains of the receptor serine/threonine kinases. The dendrogram was generated using the Jotun-Hein alignment program (Hein (1990) Meth. Enzymol. 183, 626-645).

BRIEF DESCRIPTION OF THE SEQUENCE LISTINGS

Sequences 1 and 2 are the nucleotide and deduced amino-acid sequences of cDNA for hALK-1 (clone HP57).

Sequences 3 and 4 are the nucleotide and deduced amino-acid sequences of cDNA for hALK-2 (clone HP53).

Sequences 5 and 6 are the nucleotide and deduced amino-acid sequences of cDNA for hALK-3 (clone ONF5).

Sequences 7 and 8 are the nucleotide and deduced amino-acid sequences of cDNA for hALK-4 (clone 11H8), complemented with PCR product encoding extracellular domain.

Sequences 9 and 10 are the nucleotide and deduced amino-acid sequences of cDNA for hALK-5 (clone EMBLA).

Sequences 11 and 12 are the nucleotide and deduced amino-acid sequences of cDNA for mALK-1 (clone AM6).

Sequences 13 and 14 are the nucleotide and deduced amino-acid sequences of cDNA for mALK-3 (clones ME-7 and ME-D).

Sequences 15 and 16 are the nucleotide and deduced amino-acid sequences of cDNA for mALK-4 (clone 8a1).

Sequences 17 and 18 are the nucleotide and deduced amino-acid sequences of cDNA for mALK-6 (clone ME-6).

Sequence 19 (B1-S) is a sense primer, extracellular domain, cysteine-rich region, BamHI site at 5′ end, 28-mer, 64-fold degeneracy.

Sequence 20 (B3-S) is a sense primer, kinase domain II, BamHI site at 5′ end, 25-mer, 162-fold degeneracy.

Sequence 21 (B7-S) is a sense primer, kinase domain VIB, S/T kinase-specific residues, BamHI site at 5′ end, 24-mer, 288-fold degeneracy.

Sequence 22 (E8-AS) is an anti-sense primer, kinase domain, S/T kinase-specific residues, EcoRI site at 5′ end, 20-mer, 18-fold degeneracy.

Sequence 23 is an oligonucleotide probe.

Sequence 24 is a 5′ primer.

Sequence 25 is a 3′ primer.

Sequence 26 is a consensus sequence in Subdomain I.

Sequences 27 and 28 are novel sequence motifs in Subdomain VIB.

Sequence 29 is a novel sequence motif in Subdomain VIII.

DESCRIPTION OF THE INVENTION

As described in more detail below, nucleic acid sequences have been isolated, coding for a new sub-family of serine/threonine receptor kinases. The term nucleic acid molecules as used herein refers to any sequence which codes for the murine, human or mammalian form, amino-acid sequences of which are presented herein. It is understood that the well known phenomenon of codon degeneracy provides for a great deal of sequence variation and all such varieties are included within the scope of this invention.

The nucleic acid sequences described herein may be used to clone the respective genomic DNA sequences in order to study the genes' structure and regulation. The murine and human cDNA or genomic sequences can also be used to isolate the homologous genes from other mammalian species. The mammalian DNA sequences can be used to study the receptors' functions in various in vitro and in vivo model systems.

As exemplified below for ALK-5 cDNA, it is also recognised that, given the sequence information provided herein, the artisan could easily combine the molecules with a pertinent promoter in a vector, so as to produce a cloning vehicle for expression of the molecule. The promoter and coding molecule must be operably linked via any of the well-recognized and easily-practised methodologies for so doing. The resulting vectors, as well as the isolated nucleic acid molecules themselves, may be used to transform prokaryotic cells (e.g. E. coli), or transfect eukaryotes such as yeast (S. cerevisiae), PAE, COS or CHO cell lines. Other appropriate expression systems will also be apparent to the skilled artisan.

Several methods may be used to isolate the ligands for the ALKs. As shown for ALK-5 cDNA, cDNA clones encoding the active open reading frames can be subcloned into expression vectors and transfected into eukaryotic cells, for example COS cells. The transfected cells which can express the receptor can be subjected to binding assays for radioactively-labelled members of the TGF-β superfamily (TGF-β, activins, inhibins, bone morphogenic proteins and müllerian-inhibiting substances), as it may be expected that the receptors will bind members of the TGF-β superfamily. Various biochemical or cell-based assays can be designed to identify the ligands, in tissue extracts or conditioned media, for receptors in which a ligand is not known. Antibodies raised to the receptors may also be used to identify the ligands, using the immunoprecipitation of the cross-linked complexes. Alternatively, purified receptor could be used to isolate the ligands using an affinity-based approach. The determination of the expression patterns of the receptors may also aid in the isolation of the ligand. These studies may be carried out using ALK DNA or RNA sequences as probes to perform in situ hybridisation studies.

The use of various model systems or structural studies should enable the rational development of specific agonists and antagonists useful in regulating receptor function. It may be envisaged that these can be peptides, mutated ligands, antibodies or other molecules able to interact with the receptors.

The foregoing provides examples of the invention Applicants intend to claim which includes, inter alia, isolated nucleic acid molecules coding for activin receptor-like kinases (ALKs), as defined herein. These include such sequences isolated from mammalian species such as mouse, human, rat, rabbit and monkey.

The following description relates to specific embodiments. It will be understood that the specification and examples are illustrative but not limitative of the present invention and that other embodiments within the spirit and scope of the invention will suggest themselves to those skilled in the art.

Preparation of mRNA and Construction of a cDNA Library

For construction of a cDNA library, poly (A)⁺ RNA was isolated from a human erythroleukemia cell line (HEL 92.1.7) obtained from the American Type Culture Collection (ATCC TIB 180). These cells were chosen as they have been shown to respond to both activin and TGF-β. Moreover, leukaemic cells have proved to be rich sources for the cloning of novel receptor tyrosine kinases (Partanen et al (1990) Proc. Natl. Acad. Sci. USA 87, 8913-8917 and (1992) Mol. Cell. Biol. 12, 1698-1707). Total RNA was prepared by the guanidinium isothiocyanate method (Chirgwin et al (1979) Biochemistry 18, 5294-5299). mRNA was selected using the poly-A or polyAT tract mRNA isolation kit (Promega, Madison, Wis., U.S.A.) as described by the manufacturers, or purified through an oligo (dT)-cellulose column as described by Aviv and Leder (1972) Proc. Natl. Acad. Sci. USA 69, 1408-1412. The isolated mRNA was used for the synthesis of random primed (Amersham) cDNA, that was used to make a λgt10 library with 1×10⁵ independent cDNA clones using the RiboClone cDNA synthesis system (Promega) and λgt10 in vitro packaging kit (Amersham) according to the manufacturers' procedures. An amplified oligo (dT) primed human placenta λZAPII cDNA library of 5×10⁵ independent clones was used. Poly (A)⁺ RNA isolated from AG1518 human foreskin fibroblasts was used to prepare a primary random primed λZAPII cDNA library of 1.5×10⁶ independent clones using the RiboClone cDNA synthesis system and Gigapack Gold II packaging extract (Stratagene). In addition, a primary oligo (dT) primed human foreskin fibroblast λgt10 cDNA library (Claesson-Welsh et al (1989) Proc. Natl. Acad. Sci. USA 86, 4917-4921) was prepared. An amplified oligo (dT) primed HEL cell λgt11 cDNA library of 1.5×10⁶ independent clones (Poncz et al (1987) Blood 69, 219-223) was used. A twelve-day mouse embryo λEXlox cDNA library was obtained from Novagen (Madison, Wis., U.S.A.); a mouse placenta λZAPII cDNA library was also used.

Generation of cDNA Probes by PCR

For the generation of cDNA probes by PCR (Lee et al (1988) Science 239, 1288-1291) degenerate PCR primers were constructed based upon the amino-acid sequence similarity between the mouse activin type II receptor (Mathews and Vale (1991) Cell 65, 973-982) and daf-1 (Georgi et al (1990) Cell 61, 635-645) in the kinase domains II and VIII. FIG. 1 shows the aligned serine/threonine kinase domains (I-VIII), of four related receptors of the TGF-β superfamily, i.e. hTβR-II, mActR-IIB, mActR-II and the daf-1 gene product, using the nomenclature of the subdomains according to Hanks et al (1988) Science 241, 45-52.

Several considerations were applied in the design of the PCR primers. The sequences were taken from regions of homology between the activin type II receptor and the daf-1 gene product, with particular emphasis on residues that confer serine/threonine specificity (see Table 2) and on residues that are shared by transmembrane kinase proteins and not by cytoplasmic kinases. The primers were designed so that each primer of a PCR set had an approximately similar GC composition, and so that self complementarity and complementarity between the 3′ ends of the primer sets were avoided. Degeneracy of the primers was kept as low as possible, in particular avoiding serine, leucine and arginine residues (6 possible codons), and human codon preference was applied. Degeneracy was particularly avoided at the 3′ end as, unlike the 5′ end, where mismatches are tolerated, mismatches at the 3′ end dramatically reduce the efficiency of PCR.
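
The effect of residue choice on primer degeneracy can be illustrated with a minimal sketch. It assumes the standard genetic code and counts every synonymous codon, so it gives an upper bound; the primers actually used have lower degeneracy because human codon preference was applied. The function name `primer_degeneracy` is hypothetical, and the peptide motifs are the subdomain VIB and VIII motifs named in the abstract, used here only as convenient inputs and not as the primer sequences of FIG. 2.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(peptide: str) -> int:
    """Return how many distinct DNA sequences can encode the peptide."""
    degeneracy = 1
    for residue in peptide.upper():
        degeneracy *= CODON_COUNTS[residue]
    return degeneracy


if __name__ == "__main__":
    # Motifs taken from the abstract, used purely as illustrative input.
    for motif in ("GTKRYM", "DFKSRN", "DLKSKN"):
        print(motif, primer_degeneracy(motif))
```

Each serine, leucine or arginine multiplies the count by six, which is why such residues were avoided where possible.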

In order to facilitate directional subcloning, restriction enzyme sites were included at the 5′ end of the primers, with a GC clamp, which permits efficient restriction enzyme digestion. The primers utilised are shown in FIG. 2. Oligonucleotides were synthesized using Gene Assembler Plus (Pharmacia-LKB) according to the manufacturer's instructions.

The mRNA prepared from HEL cells as described above was reverse-transcribed into cDNA in the presence of 50 mM Tris-HCl, pH 8.3, 8 mM MgCl₂, 30 mM KCl, 10 mM dithiothreitol, 2 mM nucleotide triphosphates, excess oligo (dT) primers and 34 units of AMV reverse transcriptase at 42° C. for 2 hours in 40 μl of reaction volume. Amplification by PCR was carried out with a 7.5% aliquot (3 μl) of the reverse-transcribed mRNA, in the presence of 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl₂, 0.01% gelatin, 0.2 mM nucleotide triphosphates, 1 μM of both sense and antisense primers and 2.5 units of Taq polymerase (Perkin Elmer Cetus) in 100 μl reaction volume. Amplifications were performed on a thermal cycler (Perkin Elmer Cetus) using the following program: first 5 thermal cycles with denaturation for 1 minute at 94° C., annealing for 1 minute at 50° C., a 2 minute ramp to 55° C. and elongation for 1 minute at 72° C., followed by 20 cycles of 1 minute at 94° C., 30 seconds at 55° C. and 1 minute at 72° C. A second round of PCR was performed with 3 μl of the first reaction as a template. This involved 25 thermal cycles, each composed of 94° C. (1 min), 55° C. (0.5 min), 72° C. (1 min).
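
The thermal cycling program described above can be written down as data, which makes the programmed time and the template fraction easy to check. The following is a minimal sketch; the class and function names are hypothetical, and the totals count only the programmed steps, not the instrument's own transition times between temperatures.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    label: str
    minutes: float
    temp_c: Optional[float]  # None marks a programmed temperature ramp


# First 5 cycles: low-stringency annealing with a ramp to 55 C.
INITIAL_CYCLE = [
    Step("denaturation", 1.0, 94.0),
    Step("annealing", 1.0, 50.0),
    Step("ramp to 55 C", 2.0, None),
    Step("elongation", 1.0, 72.0),
]
# Remaining cycles, also used for the second round of PCR.
STANDARD_CYCLE = [
    Step("denaturation", 1.0, 94.0),
    Step("annealing", 0.5, 55.0),
    Step("elongation", 1.0, 72.0),
]

Program = List[Tuple[int, List[Step]]]
FIRST_ROUND: Program = [(5, INITIAL_CYCLE), (20, STANDARD_CYCLE)]
SECOND_ROUND: Program = [(25, STANDARD_CYCLE)]


def programmed_minutes(program: Program) -> float:
    """Sum the programmed step times over all cycles."""
    return sum(n * sum(step.minutes for step in cycle) for n, cycle in program)


def aliquot_percent(aliquot_ul: float, total_ul: float) -> float:
    """Fraction of a reaction used as template, in percent."""
    return 100.0 * aliquot_ul / total_ul


if __name__ == "__main__":
    print(programmed_minutes(FIRST_ROUND), programmed_minutes(SECOND_ROUND))
    print(aliquot_percent(3.0, 40.0))  # 3 ul of the 40 ul reverse transcription
```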

General procedures such as purification of nucleic acids, restriction enzyme digestion, gel electrophoresis, transfer of nucleic acid to solid supports and subcloning were performed essentially according to established procedures as described by Sambrook et al, (1989), Molecular cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Laboratory (Cold Spring Harbor, N.Y., USA).

Samples of the PCR products were digested with BamHI and EcoRI and subsequently fractionated by low melting point agarose gel electrophoresis. Bands corresponding to the approximate expected sizes (see Table 1: ≈460 bp for primer pair B3-S and E8-AS and ≈140 bp for primer pair B7-S and E8-AS) were excised from the gel and the DNA was purified. Subsequently, these fragments were ligated into pUC19 (Yanisch-Perron et al (1985) Gene 33, 103-119), which had been previously linearised with BamHI and EcoRI and transformed into E. coli strain DH5α using standard protocols (Sambrook et al, supra). Individual clones were sequenced using standard double-stranded sequencing techniques and the dideoxynucleotide chain termination method as described by Sanger et al (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467, and T7 DNA polymerase.
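
Directional subcloning of this kind relies on the BamHI and EcoRI sites occurring only in the primer-derived ends of a product; an internal site would cut the fragment further. A minimal sketch of such a check follows. The recognition sequences GGATCC (BamHI) and GAATTC (EcoRI) are standard, whereas the function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
from typing import List

RECOGNITION_SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}


def find_sites(sequence: str, enzyme: str) -> List[int]:
    """Return 0-based start positions of every recognition site for the enzyme."""
    site = RECOGNITION_SITES[enzyme]
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    # Hypothetical product: GC clamp + BamHI site, insert, EcoRI site + clamp.
    product = "CGC" + "GGATCC" + "ATGCATGCAAGCTTACGT" + "GAATTC" + "GCG"
    for enzyme in RECOGNITION_SITES:
        print(enzyme, find_sites(product, enzyme))
```

A product suitable for directional cloning would show exactly one site per enzyme, each near its own end.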

Employing Reverse Transcriptase PCR on HEL mRNA with the primer pair B3-S and E8-AS, three PCR products were obtained, termed 11.1, 11.2 and 11.3, that corresponded to novel genes. Using the primer pair B7-S and E8-AS, an additional novel PCR product was obtained termed 5.2.

TABLE 1

Name of PCR   Primers      Insert      Size of DNA fragment   Sequence identity   Sequence identity
product                    size (bp)   in mActRII/hTBRII      with sequence       between mActRII
                                       clones (bp)            mActRII/hTBRII (%)  and TBR-II (%)

11.1          B3-S/E8-AS   460         460                    46/40               42
11.2          B3-S/E8-AS   460         460                    49/44               47
11.3          B3-S/E8-AS   460         460                    44/36               48
11.29         B3-S/E8-AS   460         460                    ND/100              ND
9.2           B1-S/E8-AS   800         795                    100/ND              ND
5.2           B7-S/E8-AS   140         143                    40/38               60

Isolation of cDNA Clones

The PCR products obtained were used to screen various cDNA libraries described supra. Labelling of the inserts of PCR products was performed by the random priming method (Feinberg and Vogelstein (1983) Anal. Biochem. 132, 6-13) using the Megaprime DNA labelling system (Amersham). The oligonucleotide derived from the sequence of the PCR product 5.2 was labelled by phosphorylation with T4 polynucleotide kinase following standard protocols (Sambrook et al, supra). Hybridization and purification of positive bacteriophages were performed using standard molecular biological techniques.

The double-stranded DNA clones were all sequenced using the dideoxynucleotide chain-termination method as described by Sanger et al, supra, using T7 DNA polymerase (Pharmacia-LKB) or Sequenase (U.S. Biochemical Corporation, Cleveland, Ohio, U.S.A.). Compressions of nucleotides were resolved using 7-deaza-GTP (U.S. Biochemical Corp.). DNA sequences were analyzed using the DNA STAR computer program (DNA STAR Ltd. U.K.). Analyses of the sequences obtained revealed the existence of six distinct putative receptor serine/threonine kinases which have been named ALK 1-6.

To clone cDNA for ALK-1, the oligo (dT) primed human placenta cDNA library was screened with a radiolabelled insert derived from the PCR product 11.3; based upon their restriction enzyme digestion patterns, three different types of clones with approximate insert sizes of 1.7 kb, 2 kb & 3.5 kb were identified. The 2 kb clone, named HP57, was chosen as representative of this class and subjected to complete sequencing. Sequence analysis of ALK-1 revealed a sequence of 1984 nucleotides including a poly-A tail (SEQ ID No. 1). The longest open reading frame encodes a protein of 503 amino-acids, with high sequence similarity to receptor serine/threonine kinases (see below). The first methionine codon, the putative translation start site, is at nucleotides 283-285 and is preceded by an in-frame stop codon. This first ATG is in a more favourable context for translation initiation (Kozak (1987) Nucl. Acids Res., 15, 8125-8148) than the second and third in-frame ATG at nucleotides 316-318 and 325-327. The putative initiation codon is preceded by a 5′ untranslated sequence of 282 nucleotides that is GC-rich (80% GC), which is not uncommon for growth factor receptors (Kozak (1991) J. Cell Biol., 115, 887-903). The 3′ untranslated sequence comprises 193 nucleotides and ends with a poly-A tail. No bona fide poly-A addition signal is found, but there is a sequence (AATACA), 17-22 nucleotides upstream of the poly-A tail, which may serve as a poly-A addition signal.
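
The kind of analysis described above (locating the longest open reading frame and confirming that its start codon is preceded by an in-frame stop codon) can be sketched as follows. This is an illustrative forward-strand search only, not the DNA STAR procedure used here; the function names and the short test sequence are hypothetical.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(sequence: str) -> Optional[Tuple[int, int, int]]:
    """Return (start, end, n_amino_acids) of the longest ATG-to-stop ORF.

    start is 0-based, end is exclusive and includes the stop codon.
    Only the forward strand is searched.
    """
    sequence = sequence.upper()
    best = None
    for start in range(len(sequence) - 2):
        if sequence[start:start + 3] != "ATG":
            continue
        for pos in range(start, len(sequence) - 2, 3):
            if sequence[pos:pos + 3] in STOP_CODONS:
                n_aa = (pos - start) // 3
                if best is None or n_aa > best[2]:
                    best = (start, pos + 3, n_aa)
                break
    return best


def upstream_in_frame_stop(sequence: str, start: int) -> Optional[int]:
    """Return the 0-based position of the nearest in-frame stop 5' of start."""
    sequence = sequence.upper()
    for pos in range(start - 3, -1, -3):
        if sequence[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


if __name__ == "__main__":
    toy = "CCTGAGCCATGGCTAAATGCTGAGG"  # hypothetical sequence
    orf = longest_orf(toy)
    print(orf, upstream_in_frame_stop(toy, orf[0]))
```

Adding 1 to the reported start converts it to the 1-based nucleotide numbering used in the text.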

ALK-2 cDNA was cloned by screening an amplified oligo (dT) primed human placenta cDNA library with a radiolabelled insert derived from the PCR product 11.2. Two clones, termed HP53 and HP64, with insert sizes of 2.7 kb and 2.4 kb respectively, were identified and their sequences were determined. No sequence difference in the overlapping clones was found, suggesting they are both derived from transcripts of the same gene.

Sequence analysis of cDNA clone HP53 (SEQ ID No. 3) revealed a sequence of 2719 nucleotides with a poly-A tail. The longest open reading frame encodes a protein of 509 amino-acids. The first ATG at nucleotides 104-106 agrees favourably with Kozak's consensus sequence with an A at position −3. This ATG is preceded in-frame by a stop codon. There are four ATG codons in close proximity further downstream, which agree with Kozak's consensus sequence (Kozak, supra), but according to Kozak's scanning model the first ATG is predicted to be the translation start site. The 5′ untranslated sequence is 103 nucleotides. The 3′ untranslated sequence of 1089 nucleotides contains a polyadenylation signal located 9-14 nucleotides upstream from the poly-A tail. The cDNA clone HP64 lacks 498 nucleotides from the 5′ end compared to HP53, but the sequence is extended at the 3′ end by 190 nucleotides and the poly-A tail is absent. This suggests that different polyadenylation sites occur for ALK-2. In Northern blots, however, only one transcript was detected (see below).
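
How well an ATG agrees with Kozak's consensus is often summarised by two positions: a purine at −3 and a G at +4, counting the A of the ATG as +1. The sketch below applies that simplified rule of thumb; the three-level classification and the function name are assumptions made for illustration and are not the criteria applied in this specification. The example sequences are hypothetical.

```python
def kozak_context(sequence: str, atg_index: int) -> str:
    """Classify the context of the ATG starting at atg_index (0-based).

    Simplified rule: 'strong' if there is a purine at -3 and a G at +4,
    'adequate' if only one of the two is present, otherwise 'weak'.
    """
    sequence = sequence.upper()
    if sequence[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(sequence):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the ATG")
    purine_at_minus_3 = sequence[atg_index - 3] in "AG"
    g_at_plus_4 = sequence[atg_index + 3] == "G"
    if purine_at_minus_3 and g_at_plus_4:
        return "strong"
    if purine_at_minus_3 or g_at_plus_4:
        return "adequate"
    return "weak"


if __name__ == "__main__":
    for seq in ("GCCACCATGG", "GCCACCATGT", "TTTTTTATGT"):
        print(seq, kozak_context(seq, 6))
```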

The cDNA for human ALK-3 was cloned by initially screening an oligo (dT) primed human foreskin fibroblast cDNA library with an oligonucleotide (SEQ ID No. 23) derived from the PCR product 5.2. One positive cDNA clone with an insert size of 3 kb, termed ON11, was identified. However, upon partial sequencing, it appeared that this clone was incomplete; it encodes only part of the kinase domain and lacks the extracellular domain. The most 5′ sequence of ON11, a 540 nucleotide XbaI restriction fragment encoding a truncated kinase domain, was subsequently used to probe a random primed fibroblast cDNA library from which one cDNA clone with an insert size of 3 kb, termed ONF5, was isolated (SEQ ID No. 5). Sequence analysis of ONF5 revealed a sequence of 2932 nucleotides without a poly-A tail, suggesting that this clone was derived by internal priming. The longest open reading frame codes for a protein of 532 amino-acids. The first ATG codon which is compatible with Kozak's consensus sequence (Kozak, supra), is at nucleotides 310-312 and is preceded by an in-frame stop codon. The 5′ and 3′ untranslated sequences are 309 and 1027 nucleotides long, respectively.

ALK-4 cDNA was identified by screening an oligo (dT) primed human erythroleukemia cDNA library with the radiolabelled insert of the PCR product 11.1 as a probe. One cDNA clone, termed 11H8, was identified with an insert size of 2 kb (SEQ ID No. 7). An open reading frame was found encoding a protein sequence of 383 amino-acids, including a truncated extracellular domain, with high similarity to receptor serine/threonine kinases. The 3′ untranslated sequence is 818 nucleotides and does not contain a poly-A tail, suggesting that the cDNA was internally primed. cDNA encoding the complete extracellular domain (nucleotides 1-366) was obtained from HEL cells by RT-PCR with a 5′ primer (SEQ ID No. 24) derived in part from sequence at the translation start site of SKR-2 (a cDNA sequence deposited in the GenBank data base, accession number L10125, that is identical in part to ALK-4) and a 3′ primer (SEQ ID No. 25) derived from the 11H8 cDNA clone.

ALK-5 was identified by screening the random primed HEL cell λgt10 cDNA library with the PCR product 11.1 as a probe. This yielded one positive clone termed EMBLA (insert size of 5.3 kb with 2 internal EcoRI sites). Nucleotide sequencing revealed an open reading frame of 1509 bp, coding for 503 amino-acids. The open reading frame was flanked by a 5′ untranslated sequence of 76 bp, and a 3′ untranslated sequence of 3.7 kb which was not completely sequenced. The nucleotide and deduced amino-acid sequences of ALK-5 are shown in SEQ ID Nos. 9 and 10. In the 5′ part of the open reading frame, only one ATG codon was found; this codon fulfils the rules of translation initiation (Kozak, supra). An in-frame stop codon was found at nucleotides (−54)-(−52) in the 5′ untranslated region. The predicted ATG start codon is followed by a stretch of hydrophobic amino-acid residues which has characteristics of a cleavable signal sequence. Therefore, the first ATG codon is likely to be used as a translation initiation site. A preferred cleavage site for the signal peptidase, according to von Heijne (1986) Nucl. Acid. Res. 14, 4683-4690, is located between amino-acid residues 24 and 25. The calculated molecular mass of the primary translated product of the ALK-5 without signal sequence is 53,646 Da.
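
A calculated molecular mass of this kind is obtained by removing the signal sequence and summing residue masses over the remaining chain. The sketch below assumes standard average residue masses plus one water molecule per chain and ignores glycosylation and other modifications; it is not necessarily the method used to obtain the figure above, and the function names and example peptide are hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def average_mass(protein: str) -> float:
    """Average molecular mass of an unmodified polypeptide chain."""
    return sum(RESIDUE_MASS[residue] for residue in protein.upper()) + WATER


def mature_chain(precursor: str, cleavage_after: int) -> str:
    """Drop the signal sequence; cleavage_after is the last signal residue (1-based)."""
    return precursor[cleavage_after:]


if __name__ == "__main__":
    precursor = "MKLVGA"  # hypothetical peptide, cleaved after residue 2
    mature = mature_chain(precursor, 2)
    print(mature, round(average_mass(mature), 2))
```

For ALK-5, the corresponding call would use a cleavage position of 24 on the sequence of SEQ ID No. 10.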

Screening of the mouse embryo λEXlox cDNA library using PCR product 11.1 as a probe yielded 20 positive clones. DNAs from the positive clones obtained from this library were digested with EcoRI and HindIII, electrophoretically separated on a 1.3% agarose gel and transferred to nitrocellulose filters according to established procedures as described by Sambrook et al, supra. The filters were then hybridized with specific probes for human ALK-1 (nucleotide 288-670), ALK-2 (nucleotide 1-581), ALK-3 (nucleotide 79-824) or ALK-4 (nucleotide 1178-1967). Such analyses revealed that a clone termed ME-7 hybridised with the human ALK-3 probe. However, nucleotide sequencing revealed that this clone was incomplete, and lacked the 5′ part of the translated region. Screening the same cDNA library with a probe corresponding to the extracellular domain of human ALK-3 (nucleotides 79-824) revealed the clone ME-D. This clone was isolated and the sequence was analyzed. Although this clone was incomplete in the 3′ end of the translated region, ME-7 and ME-D overlapped and together covered the complete sequence of mouse ALK-3. The predicted amino-acid sequence of mouse ALK-3 is very similar to the human sequence; only 8 amino-acid residues differ (98% identity; see SEQ ID No. 14) and the calculated molecular mass of the primary translated product without the putative signal sequence is 57,447 Da.

Of the clones obtained from the initial library screening with PCR product 11.1, four clones hybridized to the probe corresponding to the conserved kinase domain of ALK-4 but not to probes from more divergent parts of ALK-1 to -4. Analysis of these clones revealed that they have an identical sequence which differs from those of ALK-1 to -5 and was termed ALK-6. The longest clone ME-6 with a 2.0 kb insert was completely sequenced yielding a 1952 bp fragment consisting of an open reading frame of 1506 bp (502 amino-acids), flanked by a 5′ untranslated sequence of 186 bp, and a 3′ untranslated sequence of 160 bp. The nucleotide and predicted amino-acid sequences of mouse ALK-6 are shown in SEQ ID Nos. 17 and 18. No polyadenylation signal was found in the 3′ untranslated region of ME-6, indicating that the cDNA was internally primed in the 3′ end. Only one ATG codon was found in the 5′ part of the open reading frame, which fulfils the rules for translation initiation (Kozak, supra), and was preceded by an in-frame stop codon at nucleotides 163-165. However, a typical hydrophobic leader sequence was not observed at the N terminus of the translated region. Since there is no ATG codon and putative hydrophobic leader sequence, this ATG codon is likely to be used as a translation initiation site. The calculated molecular mass of the primary translated product with the putative signal sequence is 55,576 Da.

Mouse ALK-1 (clone AM6 with 1.9 kb insert) was obtained from the mouseplacenta λZAPII cDNA library using human ALK-1 cDNA as a probe (see SEQID No. 11). Mouse ALK-4 (clone 8a1 with 2.3 kb insert) was also obtainedfrom this library using human ALK-4 cDNA library as a probe (SEQ ID No.15).

To summarise, clones HP22, HP57, ONF1, ONF3, ONF4 and HP29 encode the same gene, ALK-1. Clone AM6 encodes mouse ALK-1. HP53, HP64 and HP84 encode the same gene, ALK-2. ONF5, ONF2 and ON11 encode the same gene, ALK-3. ME-7 and ME-D encode the mouse counterpart of human ALK-3. 11H8 encodes a different gene, ALK-4, whilst 8a1 encodes the mouse equivalent. EMBLA encodes ALK-5, and ME6 encodes ALK-6.

The sequence alignment between the 6 ALK genes and TβR-II, mActR-II and ActR-IIB is shown in FIG. 3. These molecules have a similar domain structure; an N-terminal predicted hydrophobic signal sequence (von Heijne (1986) Nucl. Acids Res. 14: 4683-4690) is followed by a relatively small extracellular cysteine-rich ligand binding domain, a single hydrophobic transmembrane region (Kyte & Doolittle (1982) J. Mol. Biol. 157, 105-132) and a C-terminal intracellular portion, which consists almost entirely of a kinase domain (FIGS. 3 and 4).

The extracellular domains of these receptors have cysteine-rich regions, but they show little sequence similarity; for example, less than 20% sequence identity is found between Daf-1, ActR-II, TβR-II and ALK-5. The ALKs appear to form a subfamily as they show higher sequence similarities (15-47% identity) in their extracellular domains. The extracellular domains of ALK-5 and ALK-4 have about 29% sequence identity. In addition, ALK-3 and ALK-6 share a high degree of sequence similarity in their extracellular domains (46% identity).

The positions of many of the cysteine residues in all receptors can be aligned, suggesting that the extracellular domains may adopt a similar structural configuration. See FIG. 5 for ALKs-1, -2, -3 and -5. Each of the ALKs (except ALK-6) has a potential N-linked glycosylation site, the position of which is conserved between ALK-1 and ALK-2, and between ALK-3, ALK-4 and ALK-5 (see FIG. 4).
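As an illustration of how a potential N-linked glycosylation site can be located, the following Python sketch scans the first 112 residues of human ALK-1 (transcribed into one-letter code from SEQ ID No. 2) for the commonly used sequon Asn-X-Ser/Thr, where X is any residue except proline. The sequon rule, the choice of 112 residues and the function name are assumptions made for this sketch; the text does not state how the potential site was identified.

```python
import re

# First 112 residues of human ALK-1, transcribed from SEQ ID No. 2
# (16 residues per row, as in the sequence listing).
ALK1_N_TERMINAL = (
    "MTLGSPRKGLLMLLMA"
    "LVTQGDPVKPSRGPLV"
    "TCTCESPHCKGPTCRG"
    "AWCTVVLVREEGRHPQ"
    "EHRGCGNLHRELCRGR"
    "PTEFVNHYCCDSHLCN"
    "HNVSLVLEATQPPSEQ"
)

# Asn-X-Ser/Thr with X != Pro; the lookahead allows overlapping candidates.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based Asn position, tripeptide) for every sequon found."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


for position, tripeptide in find_sequons(ALK1_N_TERMINAL):
    print(f"potential N-glycosylation site at Asn-{position}: {tripeptide}")
```

With this fragment the scan reports a single candidate, Asn-98 in the tripeptide Asn-Val-Ser, which is in line with the statement that ALK-1 has one potential N-linked glycosylation site.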

The sequence similarities in the kinase domains between daf-1, ActR-II, TβR-II and ALK-5 are approximately 40%, whereas the sequence similarity between the ALKs 1 to 6 is higher (between 59% and 90%; see FIG. 6). Pairwise comparison using the Jotun-Hein sequence alignment program (Hein (1990) Meth. Enzymol. 183, 626-645), between all family members, identifies the ALKs as a separate subclass among serine/threonine kinases (FIG. 7).
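Percentage identity figures of the kind quoted above are derived from aligned sequences. The following Python sketch shows one common way to compute such a value from two already-aligned, equal-length strings; the convention of ignoring gap columns and the function name are assumptions for illustration, since the text does not state which convention was used. The example call uses the two subdomain VIB motifs listed for the ALKs in Table 2 purely to exercise the function.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the aligned columns in which
    neither sequence has a gap (an assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Subdomain VIB motifs from Table 2: ALK-1 versus ALK-2 to ALK-6.
print(round(percent_identity("DFKSRN", "DLKSKN"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so values are only comparable when the same convention is used throughout.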

The catalytic domains of kinases can be divided into 12 subdomains with stretches of conserved amino-acid residues. The key motifs are found in serine/threonine kinase receptors, suggesting that they are functional kinases. The consensus sequence for the binding of ATP (Gly-X-Gly-X-X-Gly in subdomain I followed by a Lys residue further downstream in subdomain II) is found in all the ALKs.
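The ATP-binding consensus can be searched for with a short pattern match. The following Python sketch applies the Gly-X-Gly-X-X-Gly pattern to a 40-residue fragment of human ALK-1 (residues 195-234, transcribed into one-letter code from SEQ ID No. 2) and then reports the first lysine after the motif. Treating the first downstream lysine as the subdomain II residue is an assumption of this sketch, as are the function and constant names.

```python
import re

FRAGMENT_START = 195  # residue number of the first letter of the fragment
# Residues 195-234 of human ALK-1, transcribed from SEQ ID No. 2.
ALK1_FRAGMENT = "QRTVARQVALVECVGKGRYGEVWRGLWHGESVAVKIFSSR"

GLYCINE_LOOP = re.compile(r"G.G..G")  # Gly-X-Gly-X-X-Gly


def find_atp_motif(sequence: str, first_residue: int = 1):
    """Return (motif, motif start, first downstream Lys position) using
    residue numbering that starts at first_residue, or None if no motif."""
    match = GLYCINE_LOOP.search(sequence)
    if match is None:
        return None
    lys_index = sequence.find("K", match.end())
    lys_position = lys_index + first_residue if lys_index != -1 else None
    return match.group(0), match.start() + first_residue, lys_position


print(find_atp_motif(ALK1_FRAGMENT, FRAGMENT_START))
```

For this fragment the motif found is Gly-Lys-Gly-Arg-Tyr-Gly beginning at residue 209, and the first lysine after it is residue 229.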

The kinase domains of daf-1, ActR-II, and ALKs show approximately equal sequence similarity with tyrosine and serine/threonine protein kinases. However, analysis of the amino-acid sequences in subdomains VI and VIII, which are the most useful to distinguish a specificity for phosphorylation of tyrosine residues versus serine/threonine residues (Hanks et al (1988) Science 241, 42-52), indicates that these kinases are serine/threonine kinases; refer to Table 2.

TABLE 2

                                       SUBDOMAINS
KINASE                                 VIB       VIII
Serine/threonine kinase consensus      DLKPEN    G(T/S)XX(Y/F)X
Tyrosine kinase consensus              DLAARN    XP(I/V)(K/R)W(T/M)
ActR-II                                DIKSKN    GTRRYM
ActR-IIB                               DFKSKN    GTRRYM
TβR-II                                 DLKSSN    GTARYM
ALK-1                                  DFKSRN    GTKRYM
ALK-2, -3, -4, -5 and -6               DLKSKN    GTKRYM

The sequence motifs DLKSKN (Subdomain VIB) and GTKRYM (Subdomain VIII), that are found in most of the serine/threonine kinase receptors, agree well with the consensus sequences for all protein serine/threonine kinase receptors in these regions. In addition, these receptors, except for ALK-1, do not have a tyrosine residue surrounded by acidic residues between subdomains VII and VIII, which is common for tyrosine kinases. A unique characteristic of the members of the ALK serine/threonine kinase receptor family is the presence of two short inserts in the kinase domain between subdomains VIA and VIB and between subdomains X and XI. In the intracellular domain, these regions, together with the juxtamembrane part and C-terminal tail, are the most divergent between family members (see FIGS. 3 and 4). Based on the sequence similarity with the type II receptors for TGF-β and activin, the C termini of the kinase domains of ALKs-1 to -6 are set at Ser-495, Ser-501, Ser-527, Gln-500, Gln-498 and Ser-497, respectively.
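The comparison with the consensus sequences in Table 2 can be expressed as a pattern match. The following Python sketch translates the two subdomain VIII consensus sequences into regular expressions (X becomes any residue, a bracketed pair becomes a character class) and tests each receptor motif against both. The classification labels and function name are illustrative assumptions, and the sketch covers subdomain VIII only.

```python
import re

# Subdomain VIII consensus sequences from Table 2, written as regular expressions.
SER_THR_VIII = re.compile(r"G[TS]..[YF].")    # G(T/S)XX(Y/F)X
TYROSINE_VIII = re.compile(r".P[IV][KR]W[TM]")  # XP(I/V)(K/R)W(T/M)

# Receptor subdomain VIII motifs from Table 2.
RECEPTOR_VIII = {
    "ActR-II": "GTRRYM",
    "ActR-IIB": "GTRRYM",
    "TβR-II": "GTARYM",
    "ALK-1": "GTKRYM",
    "ALK-2 to ALK-6": "GTKRYM",
}


def classify_subdomain_viii(motif: str) -> str:
    """Label a six-residue motif by which consensus it matches in full."""
    is_ser_thr = SER_THR_VIII.fullmatch(motif) is not None
    is_tyrosine = TYROSINE_VIII.fullmatch(motif) is not None
    if is_ser_thr and not is_tyrosine:
        return "serine/threonine-like"
    if is_tyrosine and not is_ser_thr:
        return "tyrosine-like"
    return "undetermined"


for receptor, motif in RECEPTOR_VIII.items():
    print(f"{receptor}: {motif} -> {classify_subdomain_viii(motif)}")
```

All five receptor motifs in Table 2 match the serine/threonine consensus and none matches the tyrosine kinase consensus, which is the basis of the conclusion drawn in the text.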

mRNA Expression

The distribution of ALK-1, -2, -3 and -4 was determined by Northern blot analysis. A Northern blot filter with mRNAs from different human tissues was obtained from Clontech (Palo Alto, Calif.). The filters were hybridized with ³²P-labelled probes at 42° C. overnight in 50% formamide, 5×standard saline citrate (SSC; 1×SSC is 50 mM sodium citrate, pH 7.0, 150 mM NaCl), 0.1% SDS, 50 mM sodium phosphate, 5×Denhardt's solution and 0.1 mg/ml salmon sperm DNA. In order to minimize cross-hybridization, probes were used that did not encode part of the kinase domains, but corresponded to the highly diverged sequences of either 5′ untranslated and ligand-binding regions (probes for ALK-1, -2 and -3) or 3′ untranslated sequences (probe for ALK-4). The probes were labelled by random priming using the Multiprime (or Mega-prime) DNA labelling system and [α-³²P] dCTP (Feinberg & Vogelstein (1983) Anal. Biochem. 132: 6-13). Unincorporated label was removed by Sephadex G-25 chromatography. Filters were washed at 65° C., twice for 30 minutes in 2.5×SSC, 0.1% SDS and twice for 30 minutes in 0.3×SSC, 0.1% SDS before being exposed to X-ray film. Stripping of blots was performed by incubation at 90-100° C. in water for 20 minutes.
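Wash solutions such as 2.5×SSC, 0.1% SDS are normally prepared by diluting concentrated stocks. The following Python sketch applies the usual dilution relation (stock concentration × stock volume = final concentration × final volume) to the two wash solutions named above. The stock strengths (20×SSC and 10% SDS), the 500 ml batch size and the function names are assumptions chosen for illustration; none of them is given in the text.

```python
def stock_volume_ml(stock_conc: float, final_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed so that the final solution has final_conc
    (same units as stock_conc) in final_volume_ml."""
    if stock_conc <= 0 or final_conc < 0 or final_conc > stock_conc:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return final_conc / stock_conc * final_volume_ml


def wash_recipe(ssc_x: float, sds_percent: float, total_ml: float,
                ssc_stock_x: float = 20.0, sds_stock_percent: float = 10.0) -> dict:
    """Volumes of (assumed) 20x SSC and 10% SDS stocks, plus water, for one wash solution."""
    ssc_ml = stock_volume_ml(ssc_stock_x, ssc_x, total_ml)
    sds_ml = stock_volume_ml(sds_stock_percent, sds_percent, total_ml)
    return {"ssc_stock_ml": ssc_ml, "sds_stock_ml": sds_ml,
            "water_ml": total_ml - ssc_ml - sds_ml}


# The two wash solutions described in the text, for a hypothetical 500 ml batch each.
for ssc_x in (2.5, 0.3):
    print(f"{ssc_x}x SSC, 0.1% SDS:", wash_recipe(ssc_x, 0.1, 500.0))
```

The same helper can be reused for the 0.5×SSC, 0.1% SDS wash used with the ALK-5 probe by changing the first argument.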

The ALK-5 mRNA size and distribution were determined by Northern blot analysis as above. An EcoRI fragment of 980 bp of the full length ALK-5 cDNA clone, corresponding to the C-terminal part of the kinase domain and 3′ untranslated region (nucleotides 1259-2232 in SEQ ID No. 9), was used as a probe. The filter was washed twice in 0.5×SSC, 0.1% SDS at 55° C. for 15 minutes.

Using the probe for ALK-1, two transcripts of 2.2 and 4.9 kb were detected. The ALK-1 expression level varied strongly between different tissues, high in placenta and lung, moderate in heart, muscle and kidney, and low (to not detectable) in brain, liver and pancreas. The relative ratios between the two transcripts were similar in most tissues; in kidney, however, there was relatively more of the 4.9 kb transcript. By reprobing the blot with a probe for ALK-2, one transcript of 4.0 kb was detected with a ubiquitous expression pattern. Expression was detected in every tissue investigated and was highest in placenta and skeletal muscle. Subsequently the blot was reprobed for ALK-3. One major transcript of 4.4 kb and a minor transcript of 7.9 kb were detected. Expression was high in skeletal muscle, in which an additional minor transcript of 10 kb was also observed. Moderate levels of ALK-3 mRNA were detected in heart, placenta, kidney and pancreas, and low (to not detectable) expression was found in brain, lung and liver. The relative ratios between the different transcripts were similar in the tested tissues, the 4.4 kb transcript being the predominant one, with the exception of brain, where both transcripts were expressed at a similar level. Probing the blot with ALK-4 indicated the presence of a transcript with the estimated size of 5.2 kb and revealed a ubiquitous expression pattern. The results of Northern blot analysis using the probe for ALK-5 showed that a 5.5 kb transcript is expressed in all human tissues tested, being most abundant in placenta and least abundant in brain and heart.

The distribution of mRNA for mouse ALK-3 and -6 in various mouse tissues was also determined by Northern blot analysis. A multiple mouse tissue blot was obtained from Clontech, Palo Alto, Calif., U.S.A. The filter was hybridized as described above with probes for mouse ALK-3 and ALK-6. The EcoRI-PstI restriction fragment, corresponding to nucleotides 79-1100 of ALK-3, and the SacI-HpaI fragment, corresponding to nucleotides 57-720 of ALK-6, were used as probes. The filter was washed at 65° C. twice for 30 minutes in 2.5×SSC, 0.1% SDS and twice for 30 minutes with 0.3×SSC, 0.1% SDS and then subjected to autoradiography.

Using the probe for mouse ALK-3, a 1.1 kb transcript was found only in spleen. By reprobing the blot with the ALK-6 specific probe, a transcript of 7.2 kb was found in brain and a weak signal was also seen in lung. No other signal was seen in the other tissues tested, i.e. heart, liver, skeletal muscle, kidney and testis.

All detected transcript sizes were different, and thus no cross-reaction between mRNAs for the different ALKs was observed when the specific probes were used. This suggests that the multiple transcripts of ALK-1 and ALK-3 are coded from the same gene. The mechanism for generation of the different transcripts is unknown at present; they may be formed by alternative mRNA splicing, differential polyadenylation, use of different promoters, or by a combination of these events. Differences in mRNA splicing in the regions coding for the extracellular domains may lead to the synthesis of receptors with different affinities for ligands, as was shown for mActR-IIB (Attisano et al (1992) Cell 68, 97-108), or to the production of a soluble binding protein.

The above experiments describe the isolation of nucleic acid sequences coding for a new family of human receptor kinases. The cDNA for ALK-5 was then used to determine the encoded protein size and binding properties.

Properties of the ALKs cDNA Encoded Proteins

To study the properties of the proteins encoded by the different ALK cDNAs, the cDNA for each ALK was subcloned into a eukaryotic expression vector and transfected into various cell types, and the expressed proteins were then subjected to immunoprecipitation using a rabbit antiserum raised against a synthetic peptide corresponding to part of the intracellular juxtamembrane region. This region is divergent in sequence between the various serine/threonine kinase receptors. The following amino-acid residues were used:

ALK-1   145-166
ALK-2   151-172
ALK-3   181-202
ALK-4   153-171
ALK-5   158-179
ALK-6   151-168

The rabbit antiserum against ALK-5 was designated VPN.

The peptides were synthesized with an Applied Biosystems 430A Peptide Synthesizer using t-butoxycarbonyl chemistry and purified by reversed-phase high performance liquid chromatography. The peptides were coupled to keyhole limpet haemocyanin (Calbiochem-Behring) using glutaraldehyde, as described by Gullick et al (1985) EMBO J. 4, 2869-2877. The coupled peptides were mixed with Freund's adjuvant and used to immunize rabbits.

Transient Transfection of the ALK-5 cDNA

COS-1 cells (American Type Culture Collection) and the R mutant of Mv1Lu cells (for references, see below) were cultured in Dulbecco's modified Eagle's medium containing 10% fetal bovine serum (FBS), 100 units/ml penicillin and 50 μg/ml streptomycin in 5% CO₂ atmosphere at 37° C. The ALK-5 cDNA (nucleotides (−76)-2232), which includes the complete coding region, was cloned in the pSV7d vector (Truett et al, (1985) DNA 4, 333-349), and used for transfection. Transfection into COS-1 cells was performed by the calcium phosphate precipitation method (Wigler et al (1979) Cell 16, 777-785). Briefly, cells were seeded into 6-well cell culture plates at a density of 5×10⁵ cells/well, and transfected the following day with 10 μg of recombinant plasmid. After overnight incubation, cells were washed three times with a buffer containing 25 mM Tris-HCl, pH 7.4, 138 mM NaCl, 5 mM KCl, 0.7 mM CaCl₂, 0.5 mM MgCl₂ and 0.6 mM Na₂HPO₄, and then incubated with Dulbecco's modified Eagle's medium containing FBS and antibiotics. Two days after transfection, the cells were metabolically labelled by incubating the cells for 6 hours in methionine and cysteine-free MCDB 104 medium with 150 μCi/ml of [³⁵S]-methionine and [³⁵S]-cysteine (in vivo labelling mix; Amersham). After labelling, the cells were washed with 150 mM NaCl, 25 mM Tris-HCl, pH 7.4, and then solubilized with a buffer containing 20 mM Tris-HCl, pH 7.4, 150 mM NaCl, 10 mM EDTA, 1% Triton X-100, 1% deoxycholate, 1.5% Trasylol (Bayer) and 1 mM phenylmethylsulfonylfluoride (PMSF; Sigma). After 15 minutes on ice, the cell lysates were cleared by centrifugation, and the supernatants were then incubated with 7 μl of preimmune serum for 1.5 hours at 4° C. Samples were then given 50 μl of protein A-Sepharose (Pharmacia-LKB) slurry (50% packed beads in 150 mM NaCl, 20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and incubated for 45 minutes at 4° C. The beads were spun down by centrifugation, and the supernatants (1 ml) were then incubated with either 7 μl of preimmune serum or the VPN antiserum for 1.5 hours at 4° C. For blocking, 10 μg of peptide was added together with the antiserum. Immune complexes were then given 50 μl of protein A-Sepharose (Pharmacia-LKB) slurry (50% packed beads in 150 mM NaCl, 20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and incubated for 45 minutes at 4° C. The beads were spun down and washed four times with a washing buffer (20 mM Tris-HCl, pH 7.4, 500 mM NaCl, 1% Triton X-100, 1% deoxycholate and 0.2% SDS), followed by one wash in distilled water. The immune complexes were eluted by boiling for 5 minutes in the SDS-sample buffer (100 mM Tris-HCl, pH 8.8, 0.01% bromophenol blue, 36% glycerol, 4% SDS) in the presence of 10 mM DTT, and analyzed by SDS-gel electrophoresis using 7-15% polyacrylamide gels (Blobel and Dobberstein, (1975) J. Cell Biol. 67, 835-851). Gels were fixed, incubated with Amplify (Amersham) for 20 minutes, and subjected to fluorography. A component of 53 kDa was seen. This component was not seen when preimmune serum was used, or when 10 μg blocking peptide was added together with the antiserum. Moreover, it was not detectable in samples derived from untransfected COS-1 cells using either preimmune serum or the antiserum.

Digestion with Endoglycosidase F

Samples immunoprecipitated with the VPN antiserum obtained as described above were incubated with 0.5 U of endoglycosidase F (Boehringer Mannheim Biochemica) in a buffer containing 100 mM sodium phosphate, pH 6.1, 50 mM EDTA, 1% Triton X-100, 0.1% SDS and 1% β-mercaptoethanol at 37° C. for 24 hours. Samples were eluted by boiling for 5 minutes in the SDS-sample buffer, and analyzed by SDS-polyacrylamide gel electrophoresis as described above. Hydrolysis of N-linked carbohydrates by endoglycosidase F shifted the 53 kDa band to 51 kDa. The extracellular domain of ALK-5 contains one potential acceptor site for N-glycosylation and the size of the deglycosylated protein is close to the predicted size of the core protein.

Establishment of PAE Cell Lines Expressing ALK-5

In order to investigate whether the ALK-5 cDNA encodes a receptor for TGF-β, porcine aortic endothelial (PAE) cells were transfected with an expression vector containing the ALK-5 cDNA, and analyzed for the binding of ¹²⁵I-TGF-β1.

PAE cells were cultured in Ham's F-12 medium supplemented with 10% FBS and antibiotics (Miyazono et al (1988) J. Biol. Chem. 263, 6407-6415). The ALK-5 cDNA was cloned into the cytomegalovirus (CMV)-based expression vector pcDNA I/NEO (Invitrogen), and transfected into PAE cells by electroporation. After 48 hours, selection was initiated by adding Geneticin (G418 sulphate; Gibco-BRL) to the culture medium at a final concentration of 0.5 mg/ml (Westermark et al., (1990) Proc. Natl. Acad. Sci. USA 87, 128-132). Several clones were obtained, and after analysis by immunoprecipitation using the VPN antiserum, one clone denoted PAE/TβR-I was chosen and further analyzed.

Iodination of TGF-β1, Binding and Affinity Crosslinking

Recombinant human TGF-β1 was iodinated using the chloramine T method according to Frolik et al., (1984) J. Biol. Chem. 259, 10995-11000. Cross-linking experiments were performed as previously described (Ichijo et al., (1990) Exp. Cell Res. 187, 263-269). Briefly, cells in 6-well plates were washed with binding buffer (phosphate-buffered saline containing 0.9 mM CaCl₂, 0.49 mM MgCl₂ and 1 mg/ml bovine serum albumin (BSA)), and incubated on ice in the same buffer with ¹²⁵I-TGF-β1 in the presence or absence of excess unlabelled TGF-β1 for 3 hours. Cells were washed and cross-linking was done in the binding buffer without BSA together with 0.28 mM disuccinimidyl suberate (DSS; Pierce Chemical Co.) for 15 minutes on ice. The cells were harvested by the addition of 1 ml of detachment buffer (10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 10% glycerol, 0.3 mM PMSF). The cells were pelleted by centrifugation, then resuspended in 50 μl of solubilization buffer (125 mM NaCl, 10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 1% Triton X-100, 0.3 mM PMSF, 1% Trasylol) and incubated for 40 minutes on ice. Cells were centrifuged again and supernatants were subjected to analysis by SDS-gel electrophoresis using 4-15% polyacrylamide gels, followed by autoradiography. ¹²⁵I-TGF-β1 formed a 70 kDa crosslinked complex in the transfected PAE cells (PAE/TβR-I cells). The size of this complex was very similar to that of the TGF-β type I receptor complex observed at lower amounts in the untransfected cells. A concomitant increase of 94 kDa TGF-β type II receptor complex could also be observed in the PAE/TβR-I cells. Components of 150-190 kDa, which may represent crosslinked complexes between the type I and type II receptors, were also observed in the PAE/TβR-I cells.

In order to determine whether the cross-linked 70 kDa complex contained the protein encoded by the ALK-5 cDNA, the affinity cross-linking was followed by immunoprecipitation using the VPN antiserum. For this, cells in 25 cm² flasks were used. The supernatants obtained after cross-linking were incubated with 7 μl of preimmune serum or VPN antiserum in the presence or absence of 10 μg of peptide for 1.5 h at 4° C. Immune complexes were then added to 50 μl of protein A-Sepharose slurry and incubated for 45 minutes at 4° C. The protein A-Sepharose beads were washed four times with the washing buffer, once with distilled water, and the samples were analyzed by SDS-gel electrophoresis using 4-15% polyacrylamide gradient gels and autoradiography. A 70 kDa cross-linked complex was precipitated by the VPN antiserum in PAE/TβR-I cells, and a weaker band of the same size was also seen in the untransfected cells, indicating that the untransfected PAE cells contained a low amount of endogenous ALK-5. The 70 kDa complex was not observed when preimmune serum was used, or when immune serum was blocked by 10 μg of peptide. Moreover, a coprecipitated 94 kDa component could also be observed in the PAE/TβR-I cells. The latter component is likely to represent a TGF-β type II receptor complex, since an antiserum, termed DRL, which was raised against a synthetic peptide from the C-terminal part of the TGF-β type II receptor, precipitated a 94 kDa TGF-β type II receptor complex, as well as a 70 kDa type I receptor complex from PAE/TβR-I cells.

The carbohydrate contents of ALK-5 and the TGF-β type II receptor were characterized by deglycosylation using endoglycosidase F as described above and analyzed by SDS-polyacrylamide gel electrophoresis and autoradiography. The ALK-5 cross-linked complex shifted from 70 kDa to 66 kDa, whereas that of the type II receptor shifted from 94 kDa to 82 kDa. The observed larger shift of the type II receptor band compared with that of the ALK-5 band is consistent with the deglycosylation data of the type I and type II receptors on rat liver cells reported previously (Cheifetz et al (1988) J. Biol. Chem. 263, 16984-16991), and fits well with the fact that the porcine TGF-β type II receptor has two N-glycosylation sites (Lin et al (1992) Cell 68, 775-785), whereas ALK-5 has only one (see SEQ ID No. 9).

Binding of TGF-β1 to the type I receptor is known to be abolished by transient treatment of the cells with dithiothreitol (DTT) (Cheifetz and Massagué (1991) J. Biol. Chem. 266, 20767-20772; Wrana et al (1992) Cell 71, 1003-1014). When analyzed by affinity cross-linking, binding of ¹²⁵I-TGF-β1 to ALK-5, but not to the type II receptor, was completely abolished by DTT treatment of PAE/TβR-I cells. Affinity cross-linking followed by immunoprecipitation by the VPN antiserum showed that neither the ALK-5 nor the type II receptor complexes was precipitated after DTT treatment, indicating that the VPN antiserum reacts only with ALK-5. The data show that the VPN antiserum recognizes a TGF-β type I receptor, and that the type I and type II receptors form a heteromeric complex.

¹²⁵I-TGF-β1 Binding & Affinity Crosslinking of Transfected COS Cells

Transient expression plasmids of ALKs-1 to -6 and TβR-II were generated by subcloning into the pSV7d expression vector or into the pcDNA I expression vector (Invitrogen). Transient transfection of COS-1 cells and iodination of TGF-β1 were carried out as described above. Crosslinking and immunoprecipitation were performed as described for PAE cells above.

Transfection of cDNAs for ALKs into COS-1 cells did not show any appreciable binding of ¹²⁵I-TGF-β1, consistent with the observation that type I receptors do not bind TGF-β in the absence of type II receptors. When the TβR-II cDNA was co-transfected with cDNAs for the different ALKs, type I receptor-like complexes were seen, at different levels, in each case. COS-1 cells transfected with TβR-II and ALK cDNAs were analyzed by affinity crosslinking followed by immunoprecipitation using the DRL antiserum or specific antisera against ALKs. Each one of the ALKs bound ¹²⁵I-TGF-β1 and was coimmunoprecipitated with the TβR-II complex using the DRL antiserum. Comparison of the efficiency of the different ALKs to form heteromeric complexes with TβR-II revealed that ALK-5 formed such complexes more efficiently than the other ALKs. The size of the crosslinked complex was larger for ALK-3 than for other ALKs, consistent with its slightly larger size.

Expression of the ALK Protein in Different Cell Types

Two different approaches were used to elucidate which ALKs are physiological type I receptors for TGF-β.

Firstly, several cell lines were tested for the expression of the ALK proteins by cross-linking followed by immunoprecipitation using the specific antisera against ALKs and the TGF-β type II receptor. The mink lung epithelial cell line, Mv1Lu, is widely used to provide target cells for TGF-β action and is well characterized regarding TGF-β receptors (Laiho et al (1990) J. Biol. Chem. 265, 18518-18524; Laiho et al (1991) J. Biol. Chem. 266, 9108-9112). Only the VPN antiserum efficiently precipitated both type I and type II TGF-β receptors in the wild type Mv1Lu cells. The DRL antiserum also precipitated components with the same size as those precipitated by the VPN antiserum. A mutant cell line (R mutant) which lacks the TGF-β type I receptor and does not respond to TGF-β (Laiho et al, supra) was also investigated by cross-linking followed by immunoprecipitation. Consistent with the results obtained by Laiho et al (1990), supra, the type III and type II TGF-β receptor complexes, but not the type I receptor complex, were observed by affinity crosslinking. Crosslinking followed by immunoprecipitation using the DRL antiserum revealed only the type II receptor complex, whereas neither the type I nor type II receptor complexes was seen using the VPN antiserum. When the cells were metabolically labelled and subjected to immunoprecipitation using the VPN antiserum, the 53 kDa ALK-5 protein was precipitated in both the wild-type and R mutant Mv1Lu cells. These results suggest that the type I receptor expressed in the R mutant is ALK-5, which has lost the affinity for binding to TGF-β after mutation.

The type I and type II TGF-β receptor complexes could be precipitated by the VPN and DRL antisera in other cell lines, including human foreskin fibroblasts (AG1518), human lung adenocarcinoma cells (A549), and human oral squamous cell carcinoma cells (HSC-2). Affinity cross-linking studies revealed multiple TGF-β type I receptor-like complexes of 70-77 kDa in these cells. These components were less efficiently competed by excess unlabelled TGF-β1 in HSC-2 cells. Moreover, the type II receptor complex was low or not detectable in A549 and HSC-2 cells. Crosslinking followed by immunoprecipitation revealed that the VPN antiserum precipitated only the 70 kDa complex among the 70-77 kDa components. The DRL antiserum precipitated the 94 kDa type II receptor complex as well as the 70 kDa type I receptor complex in these cells, but not the putative type I receptor complexes of slightly larger sizes. These results suggest that multiple type I TGF-β receptors may exist and that the 70 kDa complex containing ALK-5 forms a heteromeric complex with the TGF-β type II receptor cloned by Lin et al (1992) Cell 68, 775-785, more efficiently than the other species. In rat pheochromocytoma cells (PC12), which have been reported to have no TGF-β receptor complexes by affinity cross-linking (Massagué et al (1990) Ann. N.Y. Acad. Sci. 593, 59-72), neither VPN nor DRL antisera precipitated the TGF-β receptor complexes. The antisera against ALKs-1 to -4 and ALK-6 did not efficiently immunoprecipitate the crosslinked receptor complexes in porcine aortic endothelial (PAE) cells or human foreskin fibroblasts.

Next, it was investigated whether ALKs could restore responsiveness to TGF-β in the R mutant of Mv1Lu cells, which lack the ligand-binding ability of the TGF-β type I receptor but have intact type II receptor. Wild-type Mv1Lu cells and mutant cells were transfected with ALK cDNA and were then assayed for the production of plasminogen activator inhibitor-1 (PAI-1), which is produced as a result of TGF-β receptor activation, as described previously by Laiho et al (1991) Mol. Cell Biol. 11, 972-978. Briefly, cells were incubated with or without 10 ng/ml of TGF-β1 for 2 hours in serum-free MCDB 104 without methionine. Thereafter, cultures were labelled with [³⁵S] methionine (40 μCi/ml) for 2 hours. The cells were removed by washing on ice once in PBS, twice in 10 mM Tris-HCl (pH 8.0), 0.5% sodium deoxycholate, 1 mM PMSF, twice in 2 mM Tris-HCl (pH 8.0), and once in PBS. Extracellular matrix proteins were extracted by scraping cells into the SDS-sample buffer containing DTT, and analyzed by SDS-gel electrophoresis followed by fluorography using Amplify. PAI-1 can be identified as a characteristic 45 kDa band (Laiho et al (1991) Mol. Cell Biol. 11, 972-978). Wild-type Mv1Lu cells responded to TGF-β and produced PAI-1, whereas the R mutant clone did not, even after stimulation by TGF-β1. Transient transfection of the ALK-5 cDNA into the R mutant clone led to the production of PAI-1 in response to the stimulation by TGF-β1, indicating that the ALK-5 cDNA encodes a functional TGF-β type I receptor. In contrast, the R mutant cells that were transfected with other ALKs did not produce PAI-1 upon the addition of TGF-β1.

Using approaches similar to those described above for the identification of TGF-β-binding ALKs, the ability of ALKs to bind activin in the presence of ActR-II was examined. COS-1 cells were co-transfected as described above. Recombinant human activin A was iodinated using the chloramine T method (Mathews and Vale (1991) Cell 65, 973-982). Transfected COS-1 cells were analysed for binding and crosslinking of ¹²⁵I-activin A in the presence or absence of excess unlabelled activin A. The crosslinked complexes were subjected to immunoprecipitation using DRL antisera or specific ALK antisera.

All ALKs appear to bind activin A in the presence of ActR-II. This is more clearly demonstrated by affinity cross-linking followed by immunoprecipitation. ALK-2 and ALK-4 bound ¹²⁵I-activin A and were coimmunoprecipitated with ActR-II. Other ALKs also bound ¹²⁵I-activin A but with a lower efficiency compared to ALK-2 and ALK-4.

In order to investigate whether ALKs are physiological activin type I receptors, activin responsive cells were examined for the expression of endogenous activin type I receptors. Mv1Lu cells, as well as the R mutant, express both type I and type II receptors for activin, and the R mutant cells produce PAI-1 upon the addition of activin A.

Mv1Lu cells were labeled with ¹²⁵I-activin A, cross-linked and immunoprecipitated by the antisera against ActR-II or ALKs as described above.

The type I and type II receptor complexes in Mv1Lu cells were immunoprecipitated only by the antisera against ALK-2, ALK-4 and ActR-II. Similar results were obtained using the R mutant cells. PAE cells do not bind activin because of the lack of type II receptors for activin, and so cells were transfected with a chimeric receptor, to enable them to bind activin, as described herein. A plasmid (chim A) containing the extracellular domain and C-terminal tail of ActR-II (amino-acids −19 to 116 and 465 to 494, respectively (Mathews and Vale (1991) Cell 65, 973-982)) and the kinase domain of TβR-II (amino-acids 160-543) (Lin et al (1992) Cell 68, 775-785) was constructed and subcloned into pcDNA/neo (Invitrogen). PAE cells were stably transfected with the chim A plasmid by electroporation, and cells expressing the chim A protein were established as described previously. PAE/Chim A cells were then subjected to ¹²⁵I-activin A labelling, crosslinking and immunoprecipitation as described above.

Similar to Mv1Lu cells, activin type I receptor complexes in PAE/Chim A cells were immunoprecipitated by the ALK-2 and ALK-4 antisera. These results show that both ALK-2 and ALK-4 serve as high affinity type I receptors for activin A in these cells.

ALK-1, ALK-3 and ALK-6 bind TGF-β1 and activin A in the presence of their respective type II receptors, but the functional consequences of the binding of the ligands remain to be elucidated.

The invention has been described by way of example only, without restriction of its scope. The invention is defined by the subject matter herein, including the claims that follow the immediately following full Sequence Listings.

29 1984 base pairs nucleic acid unknown linear cDNA NO NO internal Homosapiens CDS 283..1791 1 AGGAAACGGT TTATTAGGAG GGAGTGGTGG AGCTGGGCCAGGCAGGAAGA CGCTGGAATA 60 AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAGGCTGCCGCGC CAGCTGCGCC 120 GAGCGAGCCC CTCCCCGGCT CCAGCCCGGT CCGGGGCCGCGCCGGACCCC AGCCCGCCGT 180 CCAGCGCTGG CGGTGCAACT GCGGCCGCGC GGTGGAGGGGAGGTGGCCCC GGTCCGCCGA 240 AGGCTAGCGC CCCGCCACCC GCAGAGCGGG CCCAGAGGGA CCATG ACC TTG GGC 294 Met Thr Leu Gly 1 TCC CCC AGG AAA GGC CTT CTG ATGCTG CTG ATG GCC TTG GTG ACC CAG 342 Ser Pro Arg Lys Gly Leu Leu Met LeuLeu Met Ala Leu Val Thr Gln 5 10 15 20 GGA GAC CCT GTG AAG CCG TCT CGGGGC CCG CTG GTG ACC TGC ACG TGT 390 Gly Asp Pro Val Lys Pro Ser Arg GlyPro Leu Val Thr Cys Thr Cys 25 30 35 GAG AGC CCA CAT TGC AAG GGG CCT ACCTGC CGG GGG GCC TGG TGC ACA 438 Glu Ser Pro His Cys Lys Gly Pro Thr CysArg Gly Ala Trp Cys Thr 40 45 50 GTA GTG CTG GTG CGG GAG GAG GGG AGG CACCCC CAG GAA CAT CGG GGC 486 Val Val Leu Val Arg Glu Glu Gly Arg His ProGln Glu His Arg Gly 55 60 65 TGC GGG AAC TTG CAC AGG GAG CTC TGC AGG GGGCGC CCC ACC GAG TTC 534 Cys Gly Asn Leu His Arg Glu Leu Cys Arg Gly ArgPro Thr Glu Phe 70 75 80 GTC AAC CAC TAC TGC TGC GAC AGC CAC CTC TGC AACCAC AAC GTG TCC 582 Val Asn His Tyr Cys Cys Asp Ser His Leu Cys Asn HisAsn Val Ser 85 90 95 100 CTG GTG CTG GAG GCC ACC CAA CCT CCT TCG GAG CAGCCG GGA ACA GAT 630 Leu Val Leu Glu Ala Thr Gln Pro Pro Ser Glu Gln ProGly Thr Asp 105 110 115 GGC CAG CTG GCC CTG ATC CTG GGC CCC GTG CTG GCCTTG CTG GCC CTG 678 Gly Gln Leu Ala Leu Ile Leu Gly Pro Val Leu Ala LeuLeu Ala Leu 120 125 130 GTG GCC CTG GGT GTC CTG GGC CTG TGG CAT GTC CGACGG AGG CAG GAG 726 Val Ala Leu Gly Val Leu Gly Leu Trp His Val Arg ArgArg Gln Glu 135 140 145 AAG CAG CGT GGC CTG CAC AGC GAG CTG GGA GAG TCCAGT CTC ATC CTG 774 Lys Gln Arg Gly Leu His Ser Glu Leu Gly Glu Ser SerLeu Ile Leu 150 155 160 AAA GCA TCT GAG CAG GGC GAC ACG ATG TTG GGG GACCTC CTG GAC AGT 822 Lys Ala Ser Glu Gln Gly Asp Thr Met Leu Gly Asp LeuLeu Asp Ser 165 170 
175 180 GAC TGC ACC ACA GGG AGT GGC TCA GGG CTC CCCTTC CTG GTG CAG AGG 870 Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro PheLeu Val Gln Arg 185 190 195 ACA GTG GCA CGG CAG GTT GCC TTG GTG GAG TGTGTG GGA AAA GGC CGC 918 Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys ValGly Lys Gly Arg 200 205 210 TAT GGC GAA GTG TGG CGG GGC TTG TGG CAC GGTGAG AGT GTG GCC GTC 966 Tyr Gly Glu Val Trp Arg Gly Leu Trp His Gly GluSer Val Ala Val 215 220 225 AAG ATC TTC TCC TCG AGG GAT GAA CAG TCC TGGTTC CGG GAG ACT GAG 1014 Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp PheArg Glu Thr Glu 230 235 240 ATC TAT AAC ACA GTA TTG CTC AGA CAC GAC AACATC CTA GGC TTC ATC 1062 Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn IleLeu Gly Phe Ile 245 250 255 260 GCC TCA GAC ATG ACC TCC CGC AAC TCG AGCACG CAG CTG TGG CTC ATC 1110 Ala Ser Asp Met Thr Ser Arg Asn Ser Ser ThrGln Leu Trp Leu Ile 265 270 275 ACG CAC TAC CAC GAG CAC GGC TCC CTC TACGAC TTT CTG CAG AGA CAG 1158 Thr His Tyr His Glu His Gly Ser Leu Tyr AspPhe Leu Gln Arg Gln 280 285 290 ACG CTG GAG CCC CAT CTG GCT CTG AGG CTAGCT GTG TCC GCG GCA TGC 1206 Thr Leu Glu Pro His Leu Ala Leu Arg Leu AlaVal Ser Ala Ala Cys 295 300 305 GGC CTG GCG CAC CTG CAC GTG GAG ATC TTCGGT ACA CAG GGC AAA CCA 1254 Gly Leu Ala His Leu His Val Glu Ile Phe GlyThr Gln Gly Lys Pro 310 315 320 GCC ATT GCC CAC CGC GAC TTC AAG AGC CGCAAT GTG CTG GTC AAG AGC 1302 Ala Ile Ala His Arg Asp Phe Lys Ser Arg AsnVal Leu Val Lys Ser 325 330 335 340 AAC CTG CAG TGT TGC ATC GCC GAC CTGGGC CTG GCT GTG ATG CAC TCA 1350 Asn Leu Gln Cys Cys Ile Ala Asp Leu GlyLeu Ala Val Met His Ser 345 350 355 CAG GGC AGC GAT TAC CTG GAC ATC GGCAAC AAC CCG AGA GTG GGC ACC 1398 Gln Gly Ser Asp Tyr Leu Asp Ile Gly AsnAsn Pro Arg Val Gly Thr 360 365 370 AAG CGG TAC ATG GCA CCC GAG GTG CTGGAC GAG CAG ATC CGC ACG GAC 1446 Lys Arg Tyr Met Ala Pro Glu Val Leu AspGlu Gln Ile Arg Thr Asp 375 380 385 TGC TTT GAG TCC TAC AAG TGG ACT GACATC TGG GCC TTT GGC CTG GTG 1494 Cys Phe Glu Ser Tyr Lys Trp Thr Asp IleTrp Ala Phe Gly Leu 
Val 390 395 400 CTG TGG GAG ATT GCC CGC CGG ACC ATCGTG AAT GGC ATC GTG GAG GAC 1542 Leu Trp Glu Ile Ala Arg Arg Thr Ile ValAsn Gly Ile Val Glu Asp 405 410 415 420 TAT AGA CCA CCC TTC TAT GAT GTGGTG CCC AAT GAC CCC AGC TTT GAG 1590 Tyr Arg Pro Pro Phe Tyr Asp Val ValPro Asn Asp Pro Ser Phe Glu 425 430 435 GAC ATG AAG AAG GTG GTG TGT GTGGAT CAG CAG ACC CCC ACC ATC CCT 1638 Asp Met Lys Lys Val Val Cys Val AspGln Gln Thr Pro Thr Ile Pro 440 445 450 AAC CGG CTG GCT GCA GAC CCG GTCCTC TCA GGC CTA GCT CAG ATG ATG 1686 Asn Arg Leu Ala Ala Asp Pro Val LeuSer Gly Leu Ala Gln Met Met 455 460 465 CGG GAG TGC TGG TAC CCA AAC CCCTCT GCC CGA CTC ACC GCG CTG CGG 1734 Arg Glu Cys Trp Tyr Pro Asn Pro SerAla Arg Leu Thr Ala Leu Arg 470 475 480 ATC AAG AAG ACA CTA CAA AAA ATTAGC AAC AGT CCA GAG AAG CCT AAA 1782 Ile Lys Lys Thr Leu Gln Lys Ile SerAsn Ser Pro Glu Lys Pro Lys 485 490 495 500 GTG ATT CAA TAGCCCAGGAGCACCTGATT CCTTTCTGCC TGCAGGGGGC 1831 Val Ile Gln TGGGGGGGTG GGGGGCAGTGGATGGTGCCC TATCTGGGTA GAGGTAGTGT GAGTGTGGTG 1891 TGTGCTGGGG ATGGGCAGCTGCGCCTGCCT GCTCGGCCCC CAGCCCACCC AGCCAAAAAT 1951 ACAGCTGGGC TGAAACCTGAAAAAAAAAAA AAA 1984 503 amino acids amino acid linear protein notprovided 2 Met Thr Leu Gly Ser Pro Arg Lys Gly Leu Leu Met Leu Leu MetAla 1 5 10 15 Leu Val Thr Gln Gly Asp Pro Val Lys Pro Ser Arg Gly ProLeu Val 20 25 30 Thr Cys Thr Cys Glu Ser Pro His Cys Lys Gly Pro Thr CysArg Gly 35 40 45 Ala Trp Cys Thr Val Val Leu Val Arg Glu Glu Gly Arg HisPro Gln 50 55 60 Glu His Arg Gly Cys Gly Asn Leu His Arg Glu Leu Cys ArgGly Arg 65 70 75 80 Pro Thr Glu Phe Val Asn His Tyr Cys Cys Asp Ser HisLeu Cys Asn 85 90 95 His Asn Val Ser Leu Val Leu Glu Ala Thr Gln Pro ProSer Glu Gln 100 105 110 Pro Gly Thr Asp Gly Gln Leu Ala Leu Ile Leu GlyPro Val Leu Ala 115 120 125 Leu Leu Ala Leu Val Ala Leu Gly Val Leu GlyLeu Trp His Val Arg 130 135 140 Arg Arg Gln Glu Lys Gln Arg Gly Leu HisSer Glu Leu Gly Glu Ser 145 150 155 160 Ser Leu Ile Leu Lys Ala Ser GluGln Gly Asp Thr Met Leu Gly Asp 165 
170 175 Leu Leu Asp Ser Asp Cys ThrThr Gly Ser Gly Ser Gly Leu Pro Phe 180 185 190 Leu Val Gln Arg Thr ValAla Arg Gln Val Ala Leu Val Glu Cys Val 195 200 205 Gly Lys Gly Arg TyrGly Glu Val Trp Arg Gly Leu Trp His Gly Glu 210 215 220 Ser Val Ala ValLys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe 225 230 235 240 Arg GluThr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile 245 250 255 LeuGly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln 260 265 270Leu Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe 275 280285 Leu Gln Arg Gln Thr Leu Glu Pro His Leu Ala Leu Arg Leu Ala Val 290295 300 Ser Ala Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr305 310 315 320 Gln Gly Lys Pro Ala Ile Ala His Arg Asp Phe Lys Ser ArgAsn Val 325 330 335 Leu Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp LeuGly Leu Ala 340 345 350 Val Met His Ser Gln Gly Ser Asp Tyr Leu Asp IleGly Asn Asn Pro 355 360 365 Arg Val Gly Thr Lys Arg Tyr Met Ala Pro GluVal Leu Asp Glu Gln 370 375 380 Ile Arg Thr Asp Cys Phe Glu Ser Tyr LysTrp Thr Asp Ile Trp Ala 385 390 395 400 Phe Gly Leu Val Leu Trp Glu IleAla Arg Arg Thr Ile Val Asn Gly 405 410 415 Ile Val Glu Asp Tyr Arg ProPro Phe Tyr Asp Val Val Pro Asn Asp 420 425 430 Pro Ser Phe Glu Asp MetLys Lys Val Val Cys Val Asp Gln Gln Thr 435 440 445 Pro Thr Ile Pro AsnArg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu 450 455 460 Ala Gln Met MetArg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu 465 470 475 480 Thr AlaLeu Arg Ile Lys Lys Thr Leu Gln Lys Ile Ser Asn Ser Pro 485 490 495 GluLys Pro Lys Val Ile Gln 500 2724 base pairs nucleic acid unknown linearcDNA NO NO internal Homo sapiens CDS 104..1630 3 CTCCGAGTAC CCCAGTGACCAGAGTGAGAG AAGCTCTGAA CGAGGGCACG CGGCTTGAAG 60 GACTGTGGGC AGATGTGACCAAGAGCCTGC ATTAAGTTGT ACA ATG GTA GAT GGA 115 Met Val Asp Gly 1 GTG ATGATT CTT CCT GTG CTT ATC ATG ATT GCT CTC CCC TCC CCT AGT 163 Val Met IleLeu Pro Val Leu Ile Met Ile Ala Leu Pro Ser Pro Ser 5 10 15 20 ATG GAAGAT GAG AAG CCC AAG GTC AAC CCC AAA CTC TAC ATG TGT 
GTG 211 Met Glu AspGlu Lys Pro Lys Val Asn Pro Lys Leu Tyr Met Cys Val 25 30 35 TGT GAA GGTCTC TCC TGC GGT AAT GAG GAC CAC TGT GAA GGC CAG CAG 259 Cys Glu Gly LeuSer Cys Gly Asn Glu Asp His Cys Glu Gly Gln Gln 40 45 50 TGC TTT TCC TCACTG AGC ATC AAC GAT GGC TTC CAC GTC TAC CAG AAA 307 Cys Phe Ser Ser LeuSer Ile Asn Asp Gly Phe His Val Tyr Gln Lys 55 60 65 GGC TGC TTC CAG GTTTAT GAG CAG GGA AAG ATG ACC TGT AAG ACC CCG 355 Gly Cys Phe Gln Val TyrGlu Gln Gly Lys Met Thr Cys Lys Thr Pro 70 75 80 CCG TCC CCT GGC CAA GCTGTG GAG TGC TGC CAA GGG GAC TGG TGT AAC 403 Pro Ser Pro Gly Gln Ala ValGlu Cys Cys Gln Gly Asp Trp Cys Asn 85 90 95 100 AGG AAC ATC ACG GCC CAGCTG CCC ACT AAA GGA AAA TCC TTC CCT GGA 451 Arg Asn Ile Thr Ala Gln LeuPro Thr Lys Gly Lys Ser Phe Pro Gly 105 110 115 ACA CAG AAT TTC CAC TTGGAG GTT GGC CTC ATT ATT CTC TCT GTA GTG 499 Thr Gln Asn Phe His Leu GluVal Gly Leu Ile Ile Leu Ser Val Val 120 125 130 TTC GCA GTA TGT CTT TTAGCC TGC CTG CTG GGA GTT GCT CTC CGA AAA 547 Phe Ala Val Cys Leu Leu AlaCys Leu Leu Gly Val Ala Leu Arg Lys 135 140 145 TTT AAA AGG CGC AAC CAAGAA CGC CTC AAT CCC CGA GAC GTG GAG TAT 595 Phe Lys Arg Arg Asn Gln GluArg Leu Asn Pro Arg Asp Val Glu Tyr 150 155 160 GGC ACT ATC GAA GGG CTCATC ACC ACC AAT GTT GGA GAC AGC ACT TTA 643 Gly Thr Ile Glu Gly Leu IleThr Thr Asn Val Gly Asp Ser Thr Leu 165 170 175 180 GCA GAT TTA TTG GATCAT TCG TGT ACA TCA GGA AGT GGC TCT GGT CTT 691 Ala Asp Leu Leu Asp HisSer Cys Thr Ser Gly Ser Gly Ser Gly Leu 185 190 195 CCT TTT CTG GTA CAAAGA ACA GTG GCT CGC CAG ATT ACA CTG TTG GAG 739 Pro Phe Leu Val Gln ArgThr Val Ala Arg Gln Ile Thr Leu Leu Glu 200 205 210 TGT GTC GGG AAA GGCAGG TAT GGT GAG GTG TGG AGG GGC AGC TGG CAA 787 Cys Val Gly Lys Gly ArgTyr Gly Glu Val Trp Arg Gly Ser Trp Gln 215 220 225 GGG GAA AAT GTT GCCGTG AAG ATC TTC TCC TCC CGT GAT GAG AAG TCA 835 Gly Glu Asn Val Ala ValLys Ile Phe Ser Ser Arg Asp Glu Lys Ser 230 235 240 TGG TTC AGG GAA ACGGAA TTG TAC AAC ACT GTG ATG CTG AGG CAT GAA 883 Trp Phe 
Arg Glu Thr GluLeu Tyr Asn Thr Val Met Leu Arg His Glu 245 250 255 260 AAT ATC TTA GGTTTC ATT GCT TCA GAC ATG ACA TCA AGA CAC TCC AGT 931 Asn Ile Leu Gly PheIle Ala Ser Asp Met Thr Ser Arg His Ser Ser 265 270 275 ACC CAG CTG TGGTTA ATT ACA CAT TAT CAT GAA ATG GGA TCG TTG TAC 979 Thr Gln Leu Trp LeuIle Thr His Tyr His Glu Met Gly Ser Leu Tyr 280 285 290 GAC TAT CTT CAGCTT ACT ACT CTG GAT ACA GTT AGC TGC CTT CGA ATA 1027 Asp Tyr Leu Gln LeuThr Thr Leu Asp Thr Val Ser Cys Leu Arg Ile 295 300 305 GTG CTG TCC ATAGCT AGT GGT CTT GCA CAT TTG CAC ATA GAG ATA TTT 1075 Val Leu Ser Ile AlaSer Gly Leu Ala His Leu His Ile Glu Ile Phe 310 315 320 GGG ACC CAA GGGAAA CCA GCC ATT GCC CAT CGA GAT TTA AAG AGC AAA 1123 Gly Thr Gln Gly LysPro Ala Ile Ala His Arg Asp Leu Lys Ser Lys 325 330 335 340 AAT ATT CTGGTT AAG AAG AAT GGA CAG TGT TGC ATA GCA GAT TTG GGC 1171 Asn Ile Leu ValLys Lys Asn Gly Gln Cys Cys Ile Ala Asp Leu Gly 345 350 355 CTG GCA GTCATG CAT TCC CAG AGC ACC AAT CAG CTT GAT GTG GGG AAC 1219 Leu Ala Val MetHis Ser Gln Ser Thr Asn Gln Leu Asp Val Gly Asn 360 365 370 AAT CCC CGTGTG GGC ACC AAG CGC TAC ATG GCC CCC GAA GTT CTA GAT 1267 Asn Pro Arg ValGly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp 375 380 385 GAA ACC ATCCAG GTG GAT TGT TTC GAT TCT TAT AAA AGG GTC GAT ATT 1315 Glu Thr Ile GlnVal Asp Cys Phe Asp Ser Tyr Lys Arg Val Asp Ile 390 395 400 TGG GCC TTTGGA CTT GTT TTG TGG GAA GTG GCC AGG CGG ATG GTG AGC 1363 Trp Ala Phe GlyLeu Val Leu Trp Glu Val Ala Arg Arg Met Val Ser 405 410 415 420 AAT GGTATA GTG GAG GAT TAC AAG CCA CCG TTC TAC GAT GTG GTT CCC 1411 Asn Gly IleVal Glu Asp Tyr Lys Pro Pro Phe Tyr Asp Val Val Pro 425 430 435 AAT GACCCA AGT TTT GAA GAT ATG AGG AAG GTA GTC TGT GTG GAT CAA 1459 Asn Asp ProSer Phe Glu Asp Met Arg Lys Val Val Cys Val Asp Gln 440 445 450 CAA AGGCCA AAC ATA CCC AAC AGA TGG TTC TCA GAC CCG ACA TTA ACC 1507 Gln Arg ProAsn Ile Pro Asn Arg Trp Phe Ser Asp Pro Thr Leu Thr 455 460 465 TCT CTGGCC AAG CTA ATG AAA GAA TGC TGG TAT CAA AAT CCA TCC GCA 
1555 Ser Leu AlaLys Leu Met Lys Glu Cys Trp Tyr Gln Asn Pro Ser Ala 470 475 480 AGA CTCACA GCA CTG CGT ATC AAA AAG ACT TTG ACC AAA ATT GAT AAT 1603 Arg Leu ThrAla Leu Arg Ile Lys Lys Thr Leu Thr Lys Ile Asp Asn 485 490 495 500 TCCCTC GAC AAA TTG AAA ACT GAC TGT TGACATTTTC ATAGTGTCAA 1650 Ser Leu AspLys Leu Lys Thr Asp Cys 505 GAAGGAAGAT TTGACGTTGT TGTCATTGTC CAGCTGGGACCTAATGCTGG CCTGACTGGT 1710 TGTCAGAATG GAATCCATCT GTCTCCCTCC CCAAATGGCTGCTTTGACAA GGCAGACGTC 1770 GTACCCAGCC ATGTGTTGGG GAGACATCAA AACCACCCTAACCTCGCTCG ATGACTGTGA 1830 ACTGGGCATT TCACGAACTG TTCACACTGC AGAGACTAATGTTGGACAGA CACTGTTGCA 1890 AAGGTAGGGA CTGGAGGAAC ACAGAGAAAT CCTAAAAGAGATCTGGGCAT TAAGTCAGTG 1950 GCTTTGCATA GCTTTCACAA GTCTCCTAGA CACTCCCCACGGGAAACTCA AGGAGGTGGT 2010 GAATTTTTAA TCAGCAATAT TGCCTGTGCT TCTCTTCTTTATTGCACTAG GAATTCTTTG 2070 CATTCCTTAC TTGCACTGTT ACTCTTAATT TTAAAGACCCAACTTGCCAA AATGTTGGCT 2130 GCGTACTCCA CTGGTCTGTC TTTGGATAAT AGGAATTCAATTTGGCAAAA CAAAATGTAA 2190 TGTCAGACTT TGCTGCATTT TACACATGTG CTGATGTTTACAATGATGCC GAACATTAGG 2250 AATTGTTTAT ACACAACTTT GCAAATTATT TATTACTTGTGCACTTAGTA GTTTTTACAA 2310 AACTGCTTTG TGCATATGTT AAAGCTTATT TTTATGTGGTCTTATGATTT TATTACAGAA 2370 ATGTTTTTAA CACTATACTC TAAAATGGAC ATTTTCTTTTATTATCAGTT AAAATCACAT 2430 TTTAAGTGCT TCACATTTGT ATGTGTGTAG ACTGTAACTTTTTTTCAGTT CATATGCAGA 2490 ACGTATTTAG CCATTACCCA CGTGACACCA CCGAATATATTATCGATTTA GAAGCAAAGA 2550 TTTCAGTAGA ATTTTAGTCC TGAACGCTAC GGGGAAAATGCATTTTCTTC AGAATTATCC 2610 ATTACGTGCA TTTAAACTCT GCCAGAAAAA AATAACTATTTTGTTTTAAT CTACTTTTTG 2670 TATTTAGTAG TTATTTGTAT AAATTAAATA AACTGTTTTCAAGTCAAAAA AAAA 2724 509 amino acids amino acid linear protein notprovided 4 Met Val Asp Gly Val Met Ile Leu Pro Val Leu Ile Met Ile AlaLeu 1 5 10 15 Pro Ser Pro Ser Met Glu Asp Glu Lys Pro Lys Val Asn ProLys Leu 20 25 30 Tyr Met Cys Val Cys Glu Gly Leu Ser Cys Gly Asn Glu AspHis Cys 35 40 45 Glu Gly Gln Gln Cys Phe Ser Ser Leu Ser Ile Asn Asp GlyPhe His 50 55 60 Val Tyr Gln Lys Gly Cys Phe Gln Val Tyr Glu Gln Gly LysMet Thr 65 70 75 
80 Cys Lys Thr Pro Pro Ser Pro Gly Gln Ala Val Glu CysCys Gln Gly 85 90 95 Asp Trp Cys Asn Arg Asn Ile Thr Ala Gln Leu Pro ThrLys Gly Lys 100 105 110 Ser Phe Pro Gly Thr Gln Asn Phe His Leu Glu ValGly Leu Ile Ile 115 120 125 Leu Ser Val Val Phe Ala Val Cys Leu Leu AlaCys Leu Leu Gly Val 130 135 140 Ala Leu Arg Lys Phe Lys Arg Arg Asn GlnGlu Arg Leu Asn Pro Arg 145 150 155 160 Asp Val Glu Tyr Gly Thr Ile GluGly Leu Ile Thr Thr Asn Val Gly 165 170 175 Asp Ser Thr Leu Ala Asp LeuLeu Asp His Ser Cys Thr Ser Gly Ser 180 185 190 Gly Ser Gly Leu Pro PheLeu Val Gln Arg Thr Val Ala Arg Gln Ile 195 200 205 Thr Leu Leu Glu CysVal Gly Lys Gly Arg Tyr Gly Glu Val Trp Arg 210 215 220 Gly Ser Trp GlnGly Glu Asn Val Ala Val Lys Ile Phe Ser Ser Arg 225 230 235 240 Asp GluLys Ser Trp Phe Arg Glu Thr Glu Leu Tyr Asn Thr Val Met 245 250 255 LeuArg His Glu Asn Ile Leu Gly Phe Ile Ala Ser Asp Met Thr Ser 260 265 270Arg His Ser Ser Thr Gln Leu Trp Leu Ile Thr His Tyr His Glu Met 275 280285 Gly Ser Leu Tyr Asp Tyr Leu Gln Leu Thr Thr Leu Asp Thr Val Ser 290295 300 Cys Leu Arg Ile Val Leu Ser Ile Ala Ser Gly Leu Ala His Leu His305 310 315 320 Ile Glu Ile Phe Gly Thr Gln Gly Lys Pro Ala Ile Ala HisArg Asp 325 330 335 Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly GlnCys Cys Ile 340 345 350 Ala Asp Leu Gly Leu Ala Val Met His Ser Gln SerThr Asn Gln Leu 355 360 365 Asp Val Gly Asn Asn Pro Arg Val Gly Thr LysArg Tyr Met Ala Pro 370 375 380 Glu Val Leu Asp Glu Thr Ile Gln Val AspCys Phe Asp Ser Tyr Lys 385 390 395 400 Arg Val Asp Ile Trp Ala Phe GlyLeu Val Leu Trp Glu Val Ala Arg 405 410 415 Arg Met Val Ser Asn Gly IleVal Glu Asp Tyr Lys Pro Pro Phe Tyr 420 425 430 Asp Val Val Pro Asn AspPro Ser Phe Glu Asp Met Arg Lys Val Val 435 440 445 Cys Val Asp Gln GlnArg Pro Asn Ile Pro Asn Arg Trp Phe Ser Asp 450 455 460 Pro Thr Leu ThrSer Leu Ala Lys Leu Met Lys Glu Cys Trp Tyr Gln 465 470 475 480 Asn ProSer Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Thr 485 490 495 LysIle Asp Asn Ser Leu Asp Lys 
Leu Lys Thr Asp Cys 500 505 2932 base pairs nucleic acid unknown linear cDNA NO NO internal Homo sapiens CDS 310..1905 5 GCTCCGCGCC GAGGGCTGGA GGATGCGTTC CCTGGGGTCC GGACTTATGA AAATATGCAT 60 CAGTTTAATA CTGTCTTGGA ATTCATGAGA TGGAAGCATA GGTCAAAGCT GTTTGGAGAA 120 AATCAGAAGT ACAGTTTTAT CTAGCCACAT CTTGGAGGAG TCGTAAGAAA GCAGTGGGAG 180 TTGAAGTCAT TGTCAAGTGC TTGCGATCTT TTACAAGAAA ATCTCACTGA ATGATAGTCA 240 TTTAAATTGG TGAAGTAGCA AGACCAATTA TTAAAGGTGA CAGTACACAG GAAACATTAC 300 AATTGAACA ATG ACT CAG CTA TAC ATT TAC ATC AGA TTA TTG GGA GCC 348 Met Thr Gln Leu Tyr Ile Tyr Ile Arg Leu Leu Gly Ala 1 5 10 TAT TTG TTC ATC ATT TCT CGT GTT CAA GGA CAG AAT CTG GAT AGT ATG 396 Tyr Leu Phe Ile Ile Ser Arg Val Gln Gly Gln Asn Leu Asp Ser Met 15 20 25 CTT CAT GGC ACT GGG ATG AAA TCA GAC TCC GAC CAG AAA AAG TCA GAA 444 Leu His Gly Thr Gly Met Lys Ser Asp Ser Asp Gln Lys Lys Ser Glu 30 35 40 45 AAT GGA GTA ACC TTA GCA CCA GAG GAT ACC TTG CCT TTT TTA AAG TGC 492 Asn Gly Val Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys 50 55 60 TAT TGC TCA GGG CAC TGT CCA GAT GAT GCT ATT AAT AAC ACA TGC ATA 540 Tyr Cys Ser Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile 65 70 75 ACT AAT GGA CAT TGC TTT GCC ATC ATA GAA GAA GAT GAC CAG GGA GAA 588 Thr Asn Gly His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly Glu 80 85 90 ACC ACA TTA GCT TCA GGG TGT ATG AAA TAT GAA GGA TCT GAT TTT CAG 636 Thr Thr Leu Ala Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe Gln 95 100 105 TGC AAA GAT TCT CCA AAA GCC CAG CTA CGC CGG ACA ATA GAA TGT TGT 684 Cys Lys Asp Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu Cys Cys 110 115 120 125 CGG ACC AAT TTA TGT AAC CAG TAT TTG CAA CCC ACA CTG CCC CCT GTT 732 Arg Thr Asn Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu Pro Pro Val 130 135 140 GTC ATA GGT CCG TTT TTT GAT GGC AGC ATT CGA TGG CTG GTT TTG CTC 780 Val Ile Gly Pro Phe Phe Asp Gly Ser Ile Arg Trp Leu Val Leu Leu 145 150 155 ATT TCT ATG GCT GTC TGC ATA ATT GCT ATG ATC ATC TTC TCC AGC TGC 828 Ile Ser Met Ala Val Cys Ile Ile Ala Met Ile Ile Phe Ser Ser Cys 160 165 170
TTT TGT TAC AAACAT TAT TGC AAG AGC ATC TCA AGC AGA CGT CGT TAC 876 Phe Cys Tyr Lys HisTyr Cys Lys Ser Ile Ser Ser Arg Arg Arg Tyr 175 180 185 AAT CGT GAT TTGGAA CAG GAT GAA GCA TTT ATT CCA GTT GGA GAA TCA 924 Asn Arg Asp Leu GluGln Asp Glu Ala Phe Ile Pro Val Gly Glu Ser 190 195 200 205 CTA AAA GACCTT ATT GAC CAG TCA CAA AGT TCT GGT AGT GGG TCT GGA 972 Leu Lys Asp LeuIle Asp Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly 210 215 220 CTA CCT TTATTG GTT CAG CGA ACT ATT GCC AAA CAG ATT CAG ATG GTC 1020 Leu Pro Leu LeuVal Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val 225 230 235 CGG CAA GTTGGT AAA GGC CGA TAT GGA GAA GTA TGG ATG GGC AAA TGG 1068 Arg Gln Val GlyLys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp 240 245 250 CGT GGC GAAAAA GTG GCG GTG AAA GTA TTC TTT ACC ACT GAA GAA GCC 1116 Arg Gly Glu LysVal Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala 255 260 265 AGC TGG TTTCGA GAA ACA GAA ATC TAC CAA ACT GTG CTA ATG CGC CAT 1164 Ser Trp Phe ArgGlu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His 270 275 280 285 GAA AACATA CTT GGT TTC ATA GCG GCA GAC ATT AAA GGT ACA GGT TCC 1212 Glu Asn IleLeu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser 290 295 300 TGG ACTCAG CTC TAT TTG ATT ACT GAT TAC CAT GAA AAT GGA TCT CTC 1260 Trp Thr GlnLeu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu 305 310 315 TAT GACTTC CTG AAA TGT GCT ACA CTG GAC ACC AGA GCC CTG CTT AAA 1308 Tyr Asp PheLeu Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys 320 325 330 TTG GCTTAT TCA GCT GCC TGT GGT CTG TGC CAC CTG CAC ACA GAA ATT 1356 Leu Ala TyrSer Ala Ala Cys Gly Leu Cys His Leu His Thr Glu Ile 335 340 345 TAT GGCACC CAA GGA AAG CCC GCA ATT GCT CAT CGA GAC CTA AAG AGC 1404 Tyr Gly ThrGln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser 350 355 360 365 AAAAAC ATC CTC ATC AAG AAA AAT GGG AGT TGC TGC ATT GCT GAC CTG 1452 Lys AsnIle Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile Ala Asp Leu 370 375 380 GGCCTT GCT GTT AAA TTC AAC AGT GAC ACA AAT GAA GTT GAT GTG CCC 1500 Gly LeuAla Val Lys Phe Asn Ser Asp Thr Asn Glu Val Asp Val Pro 
385 390 395 TTGAAT ACC AGG GTG GGC ACC AAA CGC TAC ATG GCT CCC GAA GTG CTG 1548 Leu AsnThr Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu 400 405 410 GACGAA AGC CTG AAC AAA AAC CAC TTC CAG CCC TAC ATC ATG GCT GAC 1596 Asp GluSer Leu Asn Lys Asn His Phe Gln Pro Tyr Ile Met Ala Asp 415 420 425 ATCTAC AGC TTC GGC CTA ATC ATT TGG GAG ATG GCT CGT CGT TGT ATC 1644 Ile TyrSer Phe Gly Leu Ile Ile Trp Glu Met Ala Arg Arg Cys Ile 430 435 440 445ACA GGA GGG ATC GTG GAA GAA TAC CAA TTG CCA TAT TAC AAC ATG GTA 1692 ThrGly Gly Ile Val Glu Glu Tyr Gln Leu Pro Tyr Tyr Asn Met Val 450 455 460CCG AGT GAT CCG TCA TAC GAA GAT ATG CGT GAG GTT GTG TGT GTC AAA 1740 ProSer Asp Pro Ser Tyr Glu Asp Met Arg Glu Val Val Cys Val Lys 465 470 475CGT TTG CGG CCA ATT GTG TCT AAT CGG TGG AAC AGT GAT GAA TGT CTA 1788 ArgLeu Arg Pro Ile Val Ser Asn Arg Trp Asn Ser Asp Glu Cys Leu 480 485 490CGA GCA GTT TTG AAG CTA ATG TCA GAA TGC TGG GCC CAC AAT CCA GCC 1836 ArgAla Val Leu Lys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala 495 500 505TCC AGA CTC ACA GCA TTG AGA ATT AAG AAG ACG CTT GCC AAG ATG GTT 1884 SerArg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val 510 515 520525 GAA TCC CAA GAT GTA AAA ATC TGATGGTTAA ACCATCGGAG GAGAAACTCT 1935Glu Ser Gln Asp Val Lys Ile 530 AGACTGCAAG AACTGTTTTT ACCCATGGCATGGGTGGAAT TAGAGTGGAA TAAGGATGTT 1995 AACTTGGTTC TCAGACTCTT TCTTCACTACGTGTTCACAG GCTGCTAATA TTAAACCTTT 2055 CAGTACTCTT ATTAGGATAC AAGCTGGGAACTTCTAAACA CTTCATTCTT TATATATGGA 2115 CAGCTTTATT TTAAATGTGG TTTTTGATGCCTTTTTTTAA GTGGGTTTTT ATGAACTGCA 2175 TCAAGACTTC AATCCTGATT AGTGTCTCCAGTCAAGCTCT GGGTACTGAA TTGCCTGTTC 2235 ATAAAACGGT GCTTTCTGTG AAAGCCTTAAGAAGATAAAT GAGCGCAGCA GAGATGGAGA 2295 AATAGACTTT GCCTTTTACC TGAGACATTCAGTTCGTTTG TATTCTACCT TTGTAAAACA 2355 GCCTATAGAT GATGATGTGT TTGGGATACTGCTTATTTTA TGATAGTTTG TCCTGTGTCC 2415 TTAGTGATGT GTGTGTGTCT CCATGCACATGCACGCCGGG ATTCCTCTGC TGCCATTTGA 2475 ATTAGAAGAA AATAATTTAT ATGCATGCACAGGAAGATAT TGGTGGCCGG TGGTTTTGTG 2535 CTTTAAAAAT GCAATATCTG 
ACCAAGATTC GCCAATCTCA TACAAGCCAT TTACTTTGCA 2595 AGTGAGATAG CTTCCCCACC AGCTTTATTT TTTAACATGA AAGCTGATGC CAAGGCCAAA 2655 AGAAGTTTAA AGCATCTGTA AATTTGGACT GTTTTCCTTC AACCACCATT TTTTTTGTGG 2715 TTATTATTTT TGTCACGGAA AGCATCCTCT CCAAAGTTGG AGCTTCTATT GCCATGAACC 2775 ATGCTTACAA AGAAAGCACT TCTTATTGAA GTGAATTCCT GCATTTGATA GCAATGTAAG 2835 TGCCTATAAC CATGTTCTAT ATTCTTTATT CTCAGTAACT TTTAAAAGGG AAGTTATTTA 2895 TATTTTGTGT ATAATGTGCT TTATTTGCAA ATCACCC 2932 532 amino acids amino acid linear protein not provided 6 Met Thr Gln Leu Tyr Ile Tyr Ile Arg Leu Leu Gly Ala Tyr Leu Phe 1 5 10 15 Ile Ile Ser Arg Val Gln Gly Gln Asn Leu Asp Ser Met Leu His Gly 20 25 30 Thr Gly Met Lys Ser Asp Ser Asp Gln Lys Lys Ser Glu Asn Gly Val 35 40 45 Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser 50 55 60 Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly 65 70 75 80 His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu 85 90 95 Ala Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp 100 105 110 Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn 115 120 125 Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu Pro Pro Val Val Ile Gly 130 135 140 Pro Phe Phe Asp Gly Ser Ile Arg Trp Leu Val Leu Leu Ile Ser Met 145 150 155 160 Ala Val Cys Ile Ile Ala Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr 165 170 175 Lys His Tyr Cys Lys Ser Ile Ser Ser Arg Arg Arg Tyr Asn Arg Asp 180 185 190 Leu Glu Gln Asp Glu Ala Phe Ile Pro Val Gly Glu Ser Leu Lys Asp 195 200 205 Leu Ile Asp Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu 210 215 220 Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Arg Gln Val 225 230 235 240 Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg Gly Glu 245 250 255 Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe 260 265 270 Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu Asn Ile 275 280 285 Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln 290 295 300 Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe 305 310 315 320 Leu
Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr 325330 335 Ser Ala Ala Cys Gly Leu Cys His Leu His Thr Glu Ile Tyr Gly Thr340 345 350 Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys AsnIle 355 360 365 Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile Ala Asp Leu GlyLeu Ala 370 375 380 Val Lys Phe Asn Ser Asp Thr Asn Glu Val Asp Val ProLeu Asn Thr 385 390 395 400 Arg Val Gly Thr Lys Arg Tyr Met Ala Pro GluVal Leu Asp Glu Ser 405 410 415 Leu Asn Lys Asn His Phe Gln Pro Tyr IleMet Ala Asp Ile Tyr Ser 420 425 430 Phe Gly Leu Ile Ile Trp Glu Met AlaArg Arg Cys Ile Thr Gly Gly 435 440 445 Ile Val Glu Glu Tyr Gln Leu ProTyr Tyr Asn Met Val Pro Ser Asp 450 455 460 Pro Ser Tyr Glu Asp Met ArgGlu Val Val Cys Val Lys Arg Leu Arg 465 470 475 480 Pro Ile Val Ser AsnArg Trp Asn Ser Asp Glu Cys Leu Arg Ala Val 485 490 495 Leu Lys Leu MetSer Glu Cys Trp Ala His Asn Pro Ala Ser Arg Leu 500 505 510 Thr Ala LeuArg Ile Lys Lys Thr Leu Ala Lys Met Val Glu Ser Gln 515 520 525 Asp ValLys Ile 530 2333 base pairs nucleic acid unknown linear cDNA NO NOinternal Homo sapiens CDS 1..1515 7 ATG GCG GAG TCG GCC GGA GCC TCC TCCTTC TTC CCC CTT GTT GTC CTC 48 Met Ala Glu Ser Ala Gly Ala Ser Ser PhePhe Pro Leu Val Val Leu 1 5 10 15 CTG CTC GCC GGC AGC GGC GGG TCC GGGCCC CGG GGG GTC CAG GCT CTG 96 Leu Leu Ala Gly Ser Gly Gly Ser Gly ProArg Gly Val Gln Ala Leu 20 25 30 CTG TGT GCG TGC ACC AGC TGC CTC CAG GCCAAC TAC ACG TGT GAG ACA 144 Leu Cys Ala Cys Thr Ser Cys Leu Gln Ala AsnTyr Thr Cys Glu Thr 35 40 45 GAT GGG GCC TGC ATG GTT TCC TTT TTC AAT CTGGAT GGG ATG GAG CAC 192 Asp Gly Ala Cys Met Val Ser Phe Phe Asn Leu AspGly Met Glu His 50 55 60 CAT GTG CGC ACC TGC ATC CCC AAA GTG GAG CTG GTCCCT GCC GGG AAG 240 His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val ProAla Gly Lys 65 70 75 80 CCC TTC TAC TGC CTG AGC TCG GAG GAC CTG CGC AACACC CAC TGC TGC 288 Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn ThrHis Cys Cys 85 90 95 TAC ACT GAC TAC TGC AAC AGG ATC GAC TTG AGG GTG CCCAGT GGT CAC 336 Tyr Thr 
Asp Tyr Cys Asn Arg Ile Asp Leu Arg Val Pro SerGly His 100 105 110 CTC AAG GAG CCT GAG CAC CCG TCC ATG TGG GGC CCG GTGGAG CTG GTA 384 Leu Lys Glu Pro Glu His Pro Ser Met Trp Gly Pro Val GluLeu Val 115 120 125 GGC ATC ATC GCC GGC CCG GTG TTC CTC CTG TTC CTC ATCATC ATC ATT 432 Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile IleIle Ile 130 135 140 GTT TTC CTT GTC ATT AAC TAT CAT CAG CGT GTC TAT CACAAC CGC CAG 480 Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His AsnArg Gln 145 150 155 160 AGA CTG GAC ATG GAA GAT CCC TCA TGT GAG ATG TGTCTC TCC AAA GAC 528 Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys LeuSer Lys Asp 165 170 175 AAG ACG CTC CAG GAT CTT GTC TAC GAT CTC TCC ACCTCA GGG TCT GGC 576 Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr SerGly Ser Gly 180 185 190 TCA GGG TTA CCC CTC TTT GTC CAG CGC ACA GTG GCCCGA ACC ATC GTT 624 Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala ArgThr Ile Val 195 200 205 TTA CAA GAG ATT ATT GGC AAG GGT CGG TTT GGG GAAGTA TGG CGG GGC 672 Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu ValTrp Arg Gly 210 215 220 CGC TGG AGG GGT GGT GAT GTG GCT GTG AAA ATA TTCTCT TCT CGT GAA 720 Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe SerSer Arg Glu 225 230 235 240 GAA CGG TCT TGG TTC AGG GAA GCA GAG ATA TACCAG ACG GTC ATG CTG 768 Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr GlnThr Val Met Leu 245 250 255 CGC CAT GAA AAC ATC CTT GGA TTT ATT GCT GCTGAC AAT AAA GAT AAT 816 Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala AspAsn Lys Asp Asn 260 265 270 GGC ACC TGG ACA CAG CTG TGG CTT GTT TCT GACTAT CAT GAG CAC GGG 864 Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp TyrHis Glu His Gly 275 280 285 TCC CTG TTT GAT TAT CTG AAC CGG TAC ACA GTGACA ATT GAG GGG ATG 912 Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val ThrIle Glu Gly Met 290 295 300 ATT AAG CTG GCC TTG TCT GCT GCT AGT GGG CTGGCA CAC CTG CAC ATG 960 Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu AlaHis Leu His Met 305 310 315 320 GAG ATC GTG GGC ACC CAA GGG AAG CCT GGAATT GCT CAT CGA GAC TTA 1008 Glu 
Ile Val Gly Thr Gln Gly Lys Pro Gly IleAla His Arg Asp Leu 325 330 335 AAG TCA AAG AAC ATT CTG GTG AAG AAA AATGGC ATG TGT GCC ATA GCA 1056 Lys Ser Lys Asn Ile Leu Val Lys Lys Asn GlyMet Cys Ala Ile Ala 340 345 350 GAC CTG GGC CTG GCT GTC CGT CAT GAT GCAGTC ACT GAC ACC ATT GAC 1104 Asp Leu Gly Leu Ala Val Arg His Asp Ala ValThr Asp Thr Ile Asp 355 360 365 ATT GCC CCG AAT CAG AGG GTG GGG ACC AAACGA TAC ATG GCC CCT GAA 1152 Ile Ala Pro Asn Gln Arg Val Gly Thr Lys ArgTyr Met Ala Pro Glu 370 375 380 GTA CTT GAT GAA ACC ATT AAT ATG AAA CACTTT GAC TCC TTT AAA TGT 1200 Val Leu Asp Glu Thr Ile Asn Met Lys His PheAsp Ser Phe Lys Cys 385 390 395 400 GCT GAT ATT TAT GCC CTC GGG CTT GTATAT TGG GAG ATT GCT CGA AGA 1248 Ala Asp Ile Tyr Ala Leu Gly Leu Val TyrTrp Glu Ile Ala Arg Arg 405 410 415 TGC AAT TCT GGA GGA GTC CAT GAA GAATAT CAG CTG CCA TAT TAC GAC 1296 Cys Asn Ser Gly Gly Val His Glu Glu TyrGln Leu Pro Tyr Tyr Asp 420 425 430 TTA GTG CCC TCT GAC CCT TCC ATT GAGGAA ATG CGA AAG GTT GTA TGT 1344 Leu Val Pro Ser Asp Pro Ser Ile Glu GluMet Arg Lys Val Val Cys 435 440 445 GAT CAG AAG CTG CGT CCC AAC ATC CCCAAC TGG TGG CAG AGT TAT GAG 1392 Asp Gln Lys Leu Arg Pro Asn Ile Pro AsnTrp Trp Gln Ser Tyr Glu 450 455 460 GCA CTG CGG GTG ATG GGG AAG ATG ATGCGA GAG TGT TGG TAT GCC AAC 1440 Ala Leu Arg Val Met Gly Lys Met Met ArgGlu Cys Trp Tyr Ala Asn 465 470 475 480 GGC GCA GCC CGC CTG ACG GCC CTGCGC ATC AAG AAG ACC CTC TCC CAG 1488 Gly Ala Ala Arg Leu Thr Ala Leu ArgIle Lys Lys Thr Leu Ser Gln 485 490 495 CTC AGC GTG CAG GAA GAC GTG AAGATC TAACTGCTCC CTCTCTCCAC 1535 Leu Ser Val Gln Glu Asp Val Lys Ile 500505 ACGGAGCTCC TGGCAGCGAG AACTACGCAC AGCTGCCGCG TTGAGCGTAC GATGGAGGCC1595 TACCTCTCGT TTCTGCCCAG CCCTCTGTGG CCAGGAGCCC TGGCCCGCAA GAGGGACAGA1655 GCCCGGGAGA GACTCGCTCA CTCCCATGTT GGGTTTGAGA CAGACACCTT TTCTATTTAC1715 CTCCTAATGG CATGGAGACT CTGAGAGCGA ATTGTGTGGA GAACTCAGTG CCACACCTCG1775 AACTGGTTGT AGTGGGAAGT CCCGCGAAAC CCGGTGCATC TGGCACGTGG CCAGGAGCCA1835 TGACAGGGGC GCTTGGGAGG GGCCGGAGGA 
ACCGAGGTGT TGCCAGTGCT AAGCTGCCCT 1895 GAGGGTTTCC TTCGGGGACC AGCCCACAGC ACACCAAGGT GGCCCGGAAG AACCAGAAGT 1955 GCAGCCCCTC TCACAGGCAG CTCTGAGCCG CGCTTTCCCC TCCTCCCTGG GATGGACGCT 2015 GCCGGGAGAC TGCCAGTGGA GACGGAATCT GCCGCTTTGT CTGTCCAGCC GTGTGTGCAT 2075 GTGCCGAGGT GCGTCCCCCG TTGTGCCTGG TTCGTGCCAT GCCCTTACAC GTGCGTGTGA 2135 GTGTGTGTGT GTGTCTGTAG GTGCGCACTT ACCTGCTTGA GCTTTCTGTG CATGTGCAGG 2195 TCGGGGGTGT GGTCGTCATG CTGTCCGTGC TTGCTGGTGC CTCTTTTCAG TAGTGAGCAG 2255 CATCTAGTTT CCCTGGTGCC CTTCCCTGGA GGTCTCTCCC TCCCCCAGAG CCCCTCATGC 2315 CACAGTGGTA CTCTGTGT 2333 505 amino acids amino acid linear protein not provided 8 Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu Val Val Leu 1 5 10 15 Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Val Gln Ala Leu 20 25 30 Leu Cys Ala Cys Thr Ser Cys Leu Gln Ala Asn Tyr Thr Cys Glu Thr 35 40 45 Asp Gly Ala Cys Met Val Ser Phe Phe Asn Leu Asp Gly Met Glu His 50 55 60 His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro Ala Gly Lys 65 70 75 80 Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr His Cys Cys 85 90 95 Tyr Thr Asp Tyr Cys Asn Arg Ile Asp Leu Arg Val Pro Ser Gly His 100 105 110 Leu Lys Glu Pro Glu His Pro Ser Met Trp Gly Pro Val Glu Leu Val 115 120 125 Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile Ile Ile Ile 130 135 140 Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His Asn Arg Gln 145 150 155 160 Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp 165 170 175 Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly 180 185 190 Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg Thr Ile Val 195 200 205 Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly 210 215 220 Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu 225 230 235 240 Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu 245 250 255 Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn 260 265 270 Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly 275 280 285 Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr
Ile Glu Gly Met290 295 300 Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His Leu HisMet 305 310 315 320 Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala HisArg Asp Leu 325 330 335 Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly MetCys Ala Ile Ala 340 345 350 Asp Leu Gly Leu Ala Val Arg His Asp Ala ValThr Asp Thr Ile Asp 355 360 365 Ile Ala Pro Asn Gln Arg Val Gly Thr LysArg Tyr Met Ala Pro Glu 370 375 380 Val Leu Asp Glu Thr Ile Asn Met LysHis Phe Asp Ser Phe Lys Cys 385 390 395 400 Ala Asp Ile Tyr Ala Leu GlyLeu Val Tyr Trp Glu Ile Ala Arg Arg 405 410 415 Cys Asn Ser Gly Gly ValHis Glu Glu Tyr Gln Leu Pro Tyr Tyr Asp 420 425 430 Leu Val Pro Ser AspPro Ser Ile Glu Glu Met Arg Lys Val Val Cys 435 440 445 Asp Gln Lys LeuArg Pro Asn Ile Pro Asn Trp Trp Gln Ser Tyr Glu 450 455 460 Ala Leu ArgVal Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn 465 470 475 480 GlyAla Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln 485 490 495Leu Ser Val Gln Glu Asp Val Lys Ile 500 505 2308 base pairs nucleic acidunknown linear cDNA NO NO internal Mouse CDS 77..1585 9 GGCGAGGCGAGGTTTGCTGG GGTGAGGCAG CGGCGCGGCC GGGCCGGGCC GGGCCACAGG 60 CGGTGGCGGCGGGACC ATG GAG GCG GCG GTC GCT GCT CCG CGT CCC CGG 109 Met Glu Ala AlaVal Ala Ala Pro Arg Pro Arg 1 5 10 CTG CTC CTC CTC GTG CTG GCG GCG GCGGCG GCG GCG GCG GCG GCG CTG 157 Leu Leu Leu Leu Val Leu Ala Ala Ala AlaAla Ala Ala Ala Ala Leu 15 20 25 CTC CCG GGG GCG ACG GCG TTA CAG TGT TTCTGC CAC CTC TGT ACA AAA 205 Leu Pro Gly Ala Thr Ala Leu Gln Cys Phe CysHis Leu Cys Thr Lys 30 35 40 GAC AAT TTT ACT TGT GTG ACA GAT GGG CTC TGCTTT GTC TCT GTC ACA 253 Asp Asn Phe Thr Cys Val Thr Asp Gly Leu Cys PheVal Ser Val Thr 45 50 55 GAG ACC ACA GAC AAA GTT ATA CAC AAC AGC ATG TGTATA GCT GAA ATT 301 Glu Thr Thr Asp Lys Val Ile His Asn Ser Met Cys IleAla Glu Ile 60 65 70 75 GAC TTA ATT CCT CGA GAT AGG CCG TTT GTA TGT GCACCC TCT TCA AAA 349 Asp Leu Ile Pro Arg Asp Arg Pro Phe Val Cys Ala ProSer Ser Lys 80 85 90 ACT GGG TCT GTG ACT ACA ACA TAT TGC TGC AAT CAG 
GACCAT TGC AAT 397 Thr Gly Ser Val Thr Thr Thr Tyr Cys Cys Asn Gln Asp HisCys Asn 95 100 105 AAA ATA GAA CTT CCA ACT ACT GTA AAG TCA TCA CCT GGCCTT GGT CCT 445 Lys Ile Glu Leu Pro Thr Thr Val Lys Ser Ser Pro Gly LeuGly Pro 110 115 120 GTG GAA CTG GCA GCT GTC ATT GCT GGA CCA GTG TGC TTCGTC TGC ATC 493 Val Glu Leu Ala Ala Val Ile Ala Gly Pro Val Cys Phe ValCys Ile 125 130 135 TCA CTC ATG TTG ATG GTC TAT ATC TGC CAC AAC CGC ACTGTC ATT CAC 541 Ser Leu Met Leu Met Val Tyr Ile Cys His Asn Arg Thr ValIle His 140 145 150 155 CAT CGA GTG CCA AAT GAA GAG GAC CCT TCA TTA GATCGC CCT TTT ATT 589 His Arg Val Pro Asn Glu Glu Asp Pro Ser Leu Asp ArgPro Phe Ile 160 165 170 TCA GAG GGT ACT ACG TTG AAA GAC TTA ATT TAT GATATG ACA ACG TCA 637 Ser Glu Gly Thr Thr Leu Lys Asp Leu Ile Tyr Asp MetThr Thr Ser 175 180 185 GGT TCT GGC TCA GGT TTA CCA TTG CTT GTT CAG AGAACA ATT GCG AGA 685 Gly Ser Gly Ser Gly Leu Pro Leu Leu Val Gln Arg ThrIle Ala Arg 190 195 200 ACT ATT GTG TTA CAA GAA AGC ATT GGC AAA GGT CGATTT GGA GAA GTT 733 Thr Ile Val Leu Gln Glu Ser Ile Gly Lys Gly Arg PheGly Glu Val 205 210 215 TGG AGA GGA AAG TGG CGG GGA GAA GAA GTT GCT GTTAAG ATA TTC TCC 781 Trp Arg Gly Lys Trp Arg Gly Glu Glu Val Ala Val LysIle Phe Ser 220 225 230 235 TCT AGA GAA GAA CGT TCG TGG TTC CGT GAG GCAGAG ATT TAT CAA ACT 829 Ser Arg Glu Glu Arg Ser Trp Phe Arg Glu Ala GluIle Tyr Gln Thr 240 245 250 GTA ATG TTA CGT CAT GAA AAC ATC CTG GGA TTTATA GCA GCA GAC AAT 877 Val Met Leu Arg His Glu Asn Ile Leu Gly Phe IleAla Ala Asp Asn 255 260 265 AAA GAC AAT GGT ACT TGG ACT CAG CTC TGG TTGGTG TCA GAT TAT CAT 925 Lys Asp Asn Gly Thr Trp Thr Gln Leu Trp Leu ValSer Asp Tyr His 270 275 280 GAG CAT GGA TCC CTT TTT GAT TAC TTA AAC AGATAC ACA GTT ACT GTG 973 Glu His Gly Ser Leu Phe Asp Tyr Leu Asn Arg TyrThr Val Thr Val 285 290 295 GAA GGA ATG ATA AAA CTT GCT CTG TCC ACG GCGAGC GGT CTT GCC CAT 1021 Glu Gly Met Ile Lys Leu Ala Leu Ser Thr Ala SerGly Leu Ala His 300 305 310 315 CTT CAC ATG GAG ATT GTT GGT ACC CAA GGAAAG CCA 
GCC ATT GCT CAT 1069 Leu His Met Glu Ile Val Gly Thr Gln Gly LysPro Ala Ile Ala His 320 325 330 AGA GAT TTG AAA TCA AAG AAT ATC TTG GTAAAG AAG AAT GGA ACT TGC 1117 Arg Asp Leu Lys Ser Lys Asn Ile Leu Val LysLys Asn Gly Thr Cys 335 340 345 TGT ATT GCA GAC TTA GGA CTG GCA GTA AGACAT GAT TCA GCC ACA GAT 1165 Cys Ile Ala Asp Leu Gly Leu Ala Val Arg HisAsp Ser Ala Thr Asp 350 355 360 ACC ATT GAT ATT GCT CCA AAC CAC AGA GTGGGA ACA AAA AGG TAC ATG 1213 Thr Ile Asp Ile Ala Pro Asn His Arg Val GlyThr Lys Arg Tyr Met 365 370 375 GCC CCT GAA GTT CTC GAT GAT TCC ATA AATATG AAA CAT TTT GAA TCC 1261 Ala Pro Glu Val Leu Asp Asp Ser Ile Asn MetLys His Phe Glu Ser 380 385 390 395 TTC AAA CGT GCT GAC ATC TAT GCA ATGGGC TTA GTA TTC TGG GAA ATT 1309 Phe Lys Arg Ala Asp Ile Tyr Ala Met GlyLeu Val Phe Trp Glu Ile 400 405 410 GCT CGA CGA TGT TCC ATT GGT GGA ATTCAT GAA GAT TAC CAA CTG CCT 1357 Ala Arg Arg Cys Ser Ile Gly Gly Ile HisGlu Asp Tyr Gln Leu Pro 415 420 425 TAT TAT GAT CTT GTA CCT TCT GAC CCATCA GTT GAA GAA ATG AGA AAA 1405 Tyr Tyr Asp Leu Val Pro Ser Asp Pro SerVal Glu Glu Met Arg Lys 430 435 440 GTT GTT TGT GAA CAG AAG TTA AGG CCAAAT ATC CCA AAC AGA TGG CAG 1453 Val Val Cys Glu Gln Lys Leu Arg Pro AsnIle Pro Asn Arg Trp Gln 445 450 455 AGC TGT GAA GCC TTG AGA GTA ATG GCTAAA ATT ATG AGA GAA TGT TGG 1501 Ser Cys Glu Ala Leu Arg Val Met Ala LysIle Met Arg Glu Cys Trp 460 465 470 475 TAT GCC AAT GGA GCA GCT AGG CTTACA GCA TTG CGG ATT AAG AAA ACA 1549 Tyr Ala Asn Gly Ala Ala Arg Leu ThrAla Leu Arg Ile Lys Lys Thr 480 485 490 TTA TCG CAA CTC AGT CAA CAG GAAGGC ATC AAA ATG TAATTCTACA 1595 Leu Ser Gln Leu Ser Gln Gln Glu Gly IleLys Met 495 500 GCTTTGCCTG AACTCTCCTT TTTTCTTCAG ATCTGCTCCT GGGTTTTAATTTGGGAGGTC 1655 AGTTGTTCTA CCTCACTGAG AGGGAACAGA AGGATATTGC TTCCTTTTGCAGCAGTGTAA 1715 TAAAGTCAAT TAAAAACTTC CCAGGATTTC TTTGGACCCA GGAAACAGCCATGTGGGTCC 1775 TTTCTGTGCA CTATGAACGC TTCTTTCCCA GGACAGAAAA TGTGTAGTCTACCTTTATTT 1835 TTTATTAACA AAACTTGTTT TTTAAAAAGA TGATTGCTGG TCTTAACTTTAGGTAACTCT 
1895 GCTGTGCTGG AGATCATCTT TAAGGGCAAA GGAGTTGGAT TGCTGAATTACAATGAAACA 1955 TGTCTTATTA CTAAAGAAAG TGATTTACTC CTGGTTAGTA CATTCTCAGAGGATTCTGAA 2015 CCACTAGAGT TTCCTTGATT CAGACTTTGA ATGTACTGTT CTATAGTTTTTCAGGATCTT 2075 AAAACTAACA CTTATAAAAC TCTTATCTTG AGTCTAAAAA TGACCTCATATAGTAGTGAG 2135 GAACATAATT CATGCAATTG TATTTTGTAT ACTATTATTG TTCTTTCACTTATTCAGAAC 2195 ATTACATGCC TTCAAAATGG GATTGTACTA TACCAGTAAG TGCCACTTCTGTGTCTTTCT 2255 AATGGAAATG AGTAGAATTG CTGAAAGTCT CTATGTTAAA ACCTATAGTGTTT 2308 503 amino acids amino acid linear protein not provided 10 MetGlu Ala Ala Val Ala Ala Pro Arg Pro Arg Leu Leu Leu Leu Val 1 5 10 15Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu Leu Pro Gly Ala Thr 20 25 30Ala Leu Gln Cys Phe Cys His Leu Cys Thr Lys Asp Asn Phe Thr Cys 35 40 45Val Thr Asp Gly Leu Cys Phe Val Ser Val Thr Glu Thr Thr Asp Lys 50 55 60Val Ile His Asn Ser Met Cys Ile Ala Glu Ile Asp Leu Ile Pro Arg 65 70 7580 Asp Arg Pro Phe Val Cys Ala Pro Ser Ser Lys Thr Gly Ser Val Thr 85 9095 Thr Thr Tyr Cys Cys Asn Gln Asp His Cys Asn Lys Ile Glu Leu Pro 100105 110 Thr Thr Val Lys Ser Ser Pro Gly Leu Gly Pro Val Glu Leu Ala Ala115 120 125 Val Ile Ala Gly Pro Val Cys Phe Val Cys Ile Ser Leu Met LeuMet 130 135 140 Val Tyr Ile Cys His Asn Arg Thr Val Ile His His Arg ValPro Asn 145 150 155 160 Glu Glu Asp Pro Ser Leu Asp Arg Pro Phe Ile SerGlu Gly Thr Thr 165 170 175 Leu Lys Asp Leu Ile Tyr Asp Met Thr Thr SerGly Ser Gly Ser Gly 180 185 190 Leu Pro Leu Leu Val Gln Arg Thr Ile AlaArg Thr Ile Val Leu Gln 195 200 205 Glu Ser Ile Gly Lys Gly Arg Phe GlyGlu Val Trp Arg Gly Lys Trp 210 215 220 Arg Gly Glu Glu Val Ala Val LysIle Phe Ser Ser Arg Glu Glu Arg 225 230 235 240 Ser Trp Phe Arg Glu AlaGlu Ile Tyr Gln Thr Val Met Leu Arg His 245 250 255 Glu Asn Ile Leu GlyPhe Ile Ala Ala Asp Asn Lys Asp Asn Gly Thr 260 265 270 Trp Thr Gln LeuTrp Leu Val Ser Asp Tyr His Glu His Gly Ser Leu 275 280 285 Phe Asp TyrLeu Asn Arg Tyr Thr Val Thr Val Glu Gly Met Ile Lys 290 295 300 Leu AlaLeu Ser Thr Ala Ser Gly Leu 
Ala His Leu His Met Glu Ile 305 310 315 320Val Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser 325 330335 Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp Leu 340345 350 Gly Leu Ala Val Arg His Asp Ser Ala Thr Asp Thr Ile Asp Ile Ala355 360 365 Pro Asn His Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu ValLeu 370 375 380 Asp Asp Ser Ile Asn Met Lys His Phe Glu Ser Phe Lys ArgAla Asp 385 390 395 400 Ile Tyr Ala Met Gly Leu Val Phe Trp Glu Ile AlaArg Arg Cys Ser 405 410 415 Ile Gly Gly Ile His Glu Asp Tyr Gln Leu ProTyr Tyr Asp Leu Val 420 425 430 Pro Ser Asp Pro Ser Val Glu Glu Met ArgLys Val Val Cys Glu Gln 435 440 445 Lys Leu Arg Pro Asn Ile Pro Asn ArgTrp Gln Ser Cys Glu Ala Leu 450 455 460 Arg Val Met Ala Lys Ile Met ArgGlu Cys Trp Tyr Ala Asn Gly Ala 465 470 475 480 Ala Arg Leu Thr Ala LeuArg Ile Lys Lys Thr Leu Ser Gln Leu Ser 485 490 495 Gln Gln Glu Gly IleLys Met 500 1922 base pairs nucleic acid unknown linear cDNA NO NOinternal Mouse CDS 241..1746 11 GAGAGCACAG CCCTTCCCAG TCCCCGGAGCCGCCGCGCCA CGCGCGCATG ATCAAGACCT 60 TTTCCCCGGC CCCACAGGGC CTCTGGACGTGAGACCCCGG CCGCCTCCGC AAGGAGAGGC 120 GGGGGTCGAG TCGCCCTGTC CAAAGGCCTCAATCTAAACA ATCTTGATTC CTGTTGCCGG 180 CTGGCGGGAC CCTGAATGGC AGGAAATCTCACCACATCTC TTCTCCTATC TCCAAGGACC 240 ATG ACC TTG GGG AGC TTC AGA AGG GGCCTT TTG ATG CTG TCG GTG GCC 288 Met Thr Leu Gly Ser Phe Arg Arg Gly LeuLeu Met Leu Ser Val Ala 1 5 10 15 TTG GGC CTA ACC CAG GGG AGA CTT GCGAAG CCT TCC AAG CTG GTG AAC 336 Leu Gly Leu Thr Gln Gly Arg Leu Ala LysPro Ser Lys Leu Val Asn 20 25 30 TGC ACT TGT GAG AGC CCA CAC TGC AAG AGACCA TTC TGC CAG GGG TCA 384 Cys Thr Cys Glu Ser Pro His Cys Lys Arg ProPhe Cys Gln Gly Ser 35 40 45 TGG TGC ACA GTG GTG CTG GTT CGA GAG CAG GGCAGG CAC CCC CAG GTC 432 Trp Cys Thr Val Val Leu Val Arg Glu Gln Gly ArgHis Pro Gln Val 50 55 60 TAT CGG GGC TGT GGG AGC CTG AAC CAG GAG CTC TGCTTG GGA CGT CCC 480 Tyr Arg Gly Cys Gly Ser Leu Asn Gln Glu Leu Cys LeuGly Arg Pro 65 70 75 80 ACG GAG TTT CTG AAC CAT CAC TGC 
TGC TAT AGA TCCTTC TGC AAC CAC 528 Thr Glu Phe Leu Asn His His Cys Cys Tyr Arg Ser PheCys Asn His 85 90 95 AAC GTG TCT CTG ATG CTG GAG GCC ACC CAA ACT CCT TCGGAG GAG CCA 576 Asn Val Ser Leu Met Leu Glu Ala Thr Gln Thr Pro Ser GluGlu Pro 100 105 110 GAA GTT GAT GCC CAT CTG CCT CTG ATC CTG GGT CCT GTGCTG GCC TTG 624 Glu Val Asp Ala His Leu Pro Leu Ile Leu Gly Pro Val LeuAla Leu 115 120 125 CCG GTC CTG GTG GCC CTG GGT GCT CTG GGC TTG TGG CGTGTC CGG CGG 672 Pro Val Leu Val Ala Leu Gly Ala Leu Gly Leu Trp Arg ValArg Arg 130 135 140 AGG CAG GAG AAG CAG CGG GAT TTG CAC AGT GAC CTG GGCGAG TCC AGT 720 Arg Gln Glu Lys Gln Arg Asp Leu His Ser Asp Leu Gly GluSer Ser 145 150 155 160 CTC ATC CTG AAG GCA TCT GAA CAG GCA GAC AGC ATGTTG GGG GAC TTC 768 Leu Ile Leu Lys Ala Ser Glu Gln Ala Asp Ser Met LeuGly Asp Phe 165 170 175 CTG GAC AGC GAC TGT ACC ACG GGC AGC GGC TCG GGGCTC CCC TTC TTG 816 Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly LeuPro Phe Leu 180 185 190 GTG CAG AGG ACG GTA GCT CGG CAG GTT GCG CTG GTAGAG TGT GTG GGA 864 Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val GluCys Val Gly 195 200 205 AAG GGC CGA TAT GGC GAG GTG TGG CGC GGT TCG TGGCAT GGC GAA AGC 912 Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Ser Trp HisGly Glu Ser 210 215 220 GTG GCG GTC AAG ATT TTC TCC TCA CGA GAT GAG CAGTCC TGG TTC CGG 960 Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln SerTrp Phe Arg 225 230 235 240 GAG ACG GAG ATC TAC AAC ACA GTT CTG CTT AGACAC GAC AAC ATC CTA 1008 Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg HisAsp Asn Ile Leu 245 250 255 GGC TTC ATC GCC TCC GAC ATG ACT TCG CGG AACTCG AGC ACG CAG CTG 1056 Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn SerSer Thr Gln Leu 260 265 270 TGG CTC ATC ACC CAC TAC CAT GAA CAC GGC TCCCTC TAT GAC TTT CTG 1104 Trp Leu Ile Thr His Tyr His Glu His Gly Ser LeuTyr Asp Phe Leu 275 280 285 CAG AGG CAG ACG CTG GAG CCC CAG TTG GCC CTGAGG CTA GCT GTG TCC 1152 Gln Arg Gln Thr Leu Glu Pro Gln Leu Ala Leu ArgLeu Ala Val Ser 290 295 300 CCG GCC TGC GGC CTG GCG CAC CTA 
CAT GTG GAGATC TTT GGC ACT CAA 1200 Pro Ala Cys Gly Leu Ala His Leu His Val Glu IlePhe Gly Thr Gln 305 310 315 320 GGC AAA CCA GCC ATT GCC CAT CGT GAC CTCAAG AGT CGC AAT GTG CTG 1248 Gly Lys Pro Ala Ile Ala His Arg Asp Leu LysSer Arg Asn Val Leu 325 330 335 GTC AAG AGT AAC TTG CAG TGT TGC ATT GCAGAC CTG GGA CTG GCT GTG 1296 Val Lys Ser Asn Leu Gln Cys Cys Ile Ala AspLeu Gly Leu Ala Val 340 345 350 ATG CAC TCA CAA AGC AAC GAG TAC CTG GATATC GGC AAC ACA CCC CGA 1344 Met His Ser Gln Ser Asn Glu Tyr Leu Asp IleGly Asn Thr Pro Arg 355 360 365 GTG GGT ACC AAA AGA TAC ATG GCA CCC GAGGTG CTG GAT GAG CAC ATC 1392 Val Gly Thr Lys Arg Tyr Met Ala Pro Glu ValLeu Asp Glu His Ile 370 375 380 CGC ACA GAC TGC TTT GAG TCG TAC AAG TGGACA GAC ATC TGG GCC TTT 1440 Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp ThrAsp Ile Trp Ala Phe 385 390 395 400 GGC CTA GTG CTA TGG GAG ATC GCC CGGCGG ACC ATC ATC AAT GGC ATT 1488 Gly Leu Val Leu Trp Glu Ile Ala Arg ArgThr Ile Ile Asn Gly Ile 405 410 415 GTG GAG GAT TAC AGG CCA CCT TTC TATGAC ATG GTA CCC AAT GAC CCC 1536 Val Glu Asp Tyr Arg Pro Pro Phe Tyr AspMet Val Pro Asn Asp Pro 420 425 430 AGT TTT GAG GAC ATG AAA AAG GTG GTGTGC GTT GAC CAG CAG ACA CCC 1584 Ser Phe Glu Asp Met Lys Lys Val Val CysVal Asp Gln Gln Thr Pro 435 440 445 ACC ATC CCT AAC CGG CTG GCT GCA GATCCG GTC CTC TCC GGG CTG GCC 1632 Thr Ile Pro Asn Arg Leu Ala Ala Asp ProVal Leu Ser Gly Leu Ala 450 455 460 CAG ATG ATG AGA GAG TGC TGG TAC CCCAAC CCC TCT GCT CGC CTC ACC 1680 Gln Met Met Arg Glu Cys Trp Tyr Pro AsnPro Ser Ala Arg Leu Thr 465 470 475 480 GCA CTG CGC ATA AAG AAG ACA TTGCAG AAG CTC AGT CAC AAT CCA GAG 1728 Ala Leu Arg Ile Lys Lys Thr Leu GlnLys Leu Ser His Asn Pro Glu 485 490 495 AAG CCC AAA GTG ATT CACTAGCCCAGGG CCACCAGGCT TCCTCTGCCT 1776 Lys Pro Lys Val Ile His 500AAAGTGTGTG CTGGGGAAGA AGACATAGCC TGTCTGGGTA GAGGGAGTGA AGAGAGTGTG 1836CACGCTGCCC TGTGTGTGCC TGCTCAGCTT GCTCCCAGCC CATCCAGCCA AAAATACAGC 1896TGAGCTGAAA TTCAAAAAAA AAAAAA 1922 502 amino acids amino acid linearprotein 
not provided 12 Met Thr Leu Gly Ser Phe Arg Arg Gly Leu Leu MetLeu Ser Val Ala 1 5 10 15 Leu Gly Leu Thr Gln Gly Arg Leu Ala Lys ProSer Lys Leu Val Asn 20 25 30 Cys Thr Cys Glu Ser Pro His Cys Lys Arg ProPhe Cys Gln Gly Ser 35 40 45 Trp Cys Thr Val Val Leu Val Arg Glu Gln GlyArg His Pro Gln Val 50 55 60 Tyr Arg Gly Cys Gly Ser Leu Asn Gln Glu LeuCys Leu Gly Arg Pro 65 70 75 80 Thr Glu Phe Leu Asn His His Cys Cys TyrArg Ser Phe Cys Asn His 85 90 95 Asn Val Ser Leu Met Leu Glu Ala Thr GlnThr Pro Ser Glu Glu Pro 100 105 110 Glu Val Asp Ala His Leu Pro Leu IleLeu Gly Pro Val Leu Ala Leu 115 120 125 Pro Val Leu Val Ala Leu Gly AlaLeu Gly Leu Trp Arg Val Arg Arg 130 135 140 Arg Gln Glu Lys Gln Arg AspLeu His Ser Asp Leu Gly Glu Ser Ser 145 150 155 160 Leu Ile Leu Lys AlaSer Glu Gln Ala Asp Ser Met Leu Gly Asp Phe 165 170 175 Leu Asp Ser AspCys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe Leu 180 185 190 Val Gln ArgThr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val Gly 195 200 205 Lys GlyArg Tyr Gly Glu Val Trp Arg Gly Ser Trp His Gly Glu Ser 210 215 220 ValAla Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe Arg 225 230 235240 Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile Leu 245250 255 Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln Leu260 265 270 Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp PheLeu 275 280 285 Gln Arg Gln Thr Leu Glu Pro Gln Leu Ala Leu Arg Leu AlaVal Ser 290 295 300 Pro Ala Cys Gly Leu Ala His Leu His Val Glu Ile PheGly Thr Gln 305 310 315 320 Gly Lys Pro Ala Ile Ala His Arg Asp Leu LysSer Arg Asn Val Leu 325 330 335 Val Lys Ser Asn Leu Gln Cys Cys Ile AlaAsp Leu Gly Leu Ala Val 340 345 350 Met His Ser Gln Ser Asn Glu Tyr LeuAsp Ile Gly Asn Thr Pro Arg 355 360 365 Val Gly Thr Lys Arg Tyr Met AlaPro Glu Val Leu Asp Glu His Ile 370 375 380 Arg Thr Asp Cys Phe Glu SerTyr Lys Trp Thr Asp Ile Trp Ala Phe 385 390 395 400 Gly Leu Val Leu TrpGlu Ile Ala Arg Arg Thr Ile Ile Asn Gly Ile 405 410 415 Val Glu Asp TyrArg Pro Pro Phe 
Tyr Asp Met Val Pro Asn Asp Pro 420 425 430 Ser Phe GluAsp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr Pro 435 440 445 Thr IlePro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala 450 455 460 GlnMet Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu Thr 465 470 475480 Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Leu Ser His Asn Pro Glu 485490 495 Lys Pro Lys Val Ile His 500 2070 base pairs nucleic acid unknownlinear cDNA NO NO internal Mouse CDS 217..1812 13 ATTCATGAGA TGGAAGCATAGGTCAAAGCT GTTCGGAGAA ATTGGAACTA CAGTTTTATC 60 TAGCCACATC TCTGAGAATTCTGAAGAAAG CAGCAGGTGA AAGTCATTGC CAAGTGATTT 120 TGTTCTGTAA GGAAGCCTCCCTCATTCACT TACACCAGTG AGACAGCAGG ACCAGTCATT 180 CAAAGGGCCG TGTACAGGACGCGTGGCAAT CAGACA ATG ACT CAG CTA TAC ACT 234 Met Thr Gln Leu Tyr Thr 15 TAC ATC AGA TTA CTG GGA GCC TGT CTG TTC ATC ATT TCT CAT GTT CAA 282Tyr Ile Arg Leu Leu Gly Ala Cys Leu Phe Ile Ile Ser His Val Gln 10 15 20GGG CAG AAT CTA GAT AGT ATG CTC CAT GGC ACT GGT ATG AAA TCA GAC 330 GlyGln Asn Leu Asp Ser Met Leu His Gly Thr Gly Met Lys Ser Asp 25 30 35 TTGGAC CAG AAG AAG CCA GAA AAT GGA GTG ACT TTA GCA CCA GAG GAT 378 Leu AspGln Lys Lys Pro Glu Asn Gly Val Thr Leu Ala Pro Glu Asp 40 45 50 ACC TTGCCT TTC TTA AAG TGC TAT TGC TCA GGA CAC TGC CCA GAT GAT 426 Thr Leu ProPhe Leu Lys Cys Tyr Cys Ser Gly His Cys Pro Asp Asp 55 60 65 70 GCT ATTAAT AAC ACA TGC ATA ACT AAT GGC CAT TGC TTT GCC ATT ATA 474 Ala Ile AsnAsn Thr Cys Ile Thr Asn Gly His Cys Phe Ala Ile Ile 75 80 85 GAA GAA GATGAT CAG GGA GAA ACC ACA TTA ACT TCT GGG TGT ATG AAG 522 Glu Glu Asp AspGln Gly Glu Thr Thr Leu Thr Ser Gly Cys Met Lys 90 95 100 TAT GAA GGCTCT GAT TTT CAA TGC AAG GAT TCA CCG AAA GCC CAG CTA 570 Tyr Glu Gly SerAsp Phe Gln Cys Lys Asp Ser Pro Lys Ala Gln Leu 105 110 115 CGC AGG ACAATA GAA TGT TGT CGG ACC AAT TTG TGC AAC CAG TAT TTG 618 Arg Arg Thr IleGlu Cys Cys Arg Thr Asn Leu Cys Asn Gln Tyr Leu 120 125 130 CAG CCT ACACTG CCC CCT GTT GTT ATA GGT CCG TTC TTT GAT GGC AGC 666 Gln Pro Thr LeuPro Pro Val Val Ile Gly Pro Phe 
Phe Asp Gly Ser 135 140 145 150 ATC CGATGG CTG GTT GTG CTC ATT TCC ATG GCT GTC TGT ATA GTT GCT 714 Ile Arg TrpLeu Val Val Leu Ile Ser Met Ala Val Cys Ile Val Ala 155 160 165 ATG ATCATC TTC TCC AGC TGC TTT TGC TAT AAG CAT TAT TGT AAG AGT 762 Met Ile IlePhe Ser Ser Cys Phe Cys Tyr Lys His Tyr Cys Lys Ser 170 175 180 ATC TCAAGC AGG GGT CGT TAC AAC CGT GAT TTG GAA CAG GAT GAA GCA 810 Ile Ser SerArg Gly Arg Tyr Asn Arg Asp Leu Glu Gln Asp Glu Ala 185 190 195 TTT ATTCCA GTA GGA GAA TCA TTG AAA GAC CTG ATT GAC CAG TCC CAA 858 Phe Ile ProVal Gly Glu Ser Leu Lys Asp Leu Ile Asp Gln Ser Gln 200 205 210 AGC TCTGGG AGT GGA TCT GGA TTG CCT TTA TTG GTT CAG CGA ACT ATT 906 Ser Ser GlySer Gly Ser Gly Leu Pro Leu Leu Val Gln Arg Thr Ile 215 220 225 230 GCCAAA CAG ATT CAG ATG GTT CGG CAG GTT GGT AAA GGC CGC TAT GGA 954 Ala LysGln Ile Gln Met Val Arg Gln Val Gly Lys Gly Arg Tyr Gly 235 240 245 GAAGTA TGG ATG GGT AAA TGG CGT GGT GAA AAA GTG GCT GTC AAA GTG 1002 Glu ValTrp Met Gly Lys Trp Arg Gly Glu Lys Val Ala Val Lys Val 250 255 260 TTTTTT ACC ACT GAA GAA GCT AGC TGG TTT AGA GAA ACA GAA ATC TAC 1050 Phe PheThr Thr Glu Glu Ala Ser Trp Phe Arg Glu Thr Glu Ile Tyr 265 270 275 CAGACG GTG TTA ATG CGT CAT GAA AAT ATA CTT GGT TTT ATA GCT GCA 1098 Gln ThrVal Leu Met Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala 280 285 290 GACATT AAA GGC ACT GGT TCC TGG ACT CAG CTG TAT TTG ATT ACT GAT 1146 Asp IleLys Gly Thr Gly Ser Trp Thr Gln Leu Tyr Leu Ile Thr Asp 295 300 305 310TAC CAT GAA AAT GGA TCT CTC TAT GAC TTC CTG AAA TGT GCC ACA CTA 1194 TyrHis Glu Asn Gly Ser Leu Tyr Asp Phe Leu Lys Cys Ala Thr Leu 315 320 325GAC ACC AGA GCC CTA CTC AAG TTA GCT TAT TCT GCT GCT TGT GGT CTG 1242 AspThr Arg Ala Leu Leu Lys Leu Ala Tyr Ser Ala Ala Cys Gly Leu 330 335 340TGC CAC CTC CAC ACA GAA ATT TAT GGT ACC CAA GGG AAG CCT GCA ATT 1290 CysHis Leu His Thr Glu Ile Tyr Gly Thr Gln Gly Lys Pro Ala Ile 345 350 355GCT CAT CGA GAC CTG AAG AGC AAA AAC ATC CTT ATT AAG AAA AAT GGA 1338 AlaHis Arg Asp Leu Lys Ser Lys Asn Ile 
Leu Ile Lys Lys Asn Gly 360 365 370AGT TGC TGT ATT GCT GAC CTG GGC CTA GCT GTT AAA TTC AAC AGT GAT 1386 SerCys Cys Ile Ala Asp Leu Gly Leu Ala Val Lys Phe Asn Ser Asp 375 380 385390 ACA AAT GAA GTT GAC ATA CCC TTG AAT ACC AGG GTG GGC ACC AAG CGG 1434Thr Asn Glu Val Asp Ile Pro Leu Asn Thr Arg Val Gly Thr Lys Arg 395 400405 TAC ATG GCT CCA GAA GTG CTG GAT GAA AGC CTG AAT AAA AAC CAT TTC 1482Tyr Met Ala Pro Glu Val Leu Asp Glu Ser Leu Asn Lys Asn His Phe 410 415420 CAG CCC TAC ATC ATG GCT GAC ATC TAT AGC TTT GGT TTG ATC ATT TGG 1530Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser Phe Gly Leu Ile Ile Trp 425 430435 GAA ATG GCT CGT CGT TGT ATT ACA GGA GGA ATC GTG GAG GAA TAT CAA 1578Glu Met Ala Arg Arg Cys Ile Thr Gly Gly Ile Val Glu Glu Tyr Gln 440 445450 TTA CCA TAT TAC AAC ATG GTG CCC AGT GAC CCA TCC TAT GAG GAC ATG 1626Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp Pro Ser Tyr Glu Asp Met 455 460465 470 CGT GAG GTT GTG TGT GTG AAA CGC TTG CGG CCA ATC GTG TCT AAC CGC1674 Arg Glu Val Val Cys Val Lys Arg Leu Arg Pro Ile Val Ser Asn Arg 475480 485 TGG AAC AGC GAT GAA TGT CTT CGA GCA GTT TTG AAG CTA ATG TCA GAA1722 Trp Asn Ser Asp Glu Cys Leu Arg Ala Val Leu Lys Leu Met Ser Glu 490495 500 TGT TGG GCC CAT AAT CCA GCC TCC AGA CTC ACA GCT TTG AGA ATC AAG1770 Cys Trp Ala His Asn Pro Ala Ser Arg Leu Thr Ala Leu Arg Ile Lys 505510 515 AAG ACA CTT GCA AAA ATG GTT GAA TCC CAG GAT GTA AAG ATT 1812 LysThr Leu Ala Lys Met Val Glu Ser Gln Asp Val Lys Ile 520 525 530TGACAATTAA ACAATTTTGA GGGAGAATTT AGACTGCAAG AACTTCTTCA CCCAAGGAAT 1872GGGTGGGATT AGCATGGAAT AGGATGTTGA CTTGGTTTCC AGACTCCTTC CTCTACATCT 1932TCACAGGCTG CTAACAGTAA ACCTTACCGT ACTCTACAGA ATACAAGATT GGAACTTGGA 1992ACTTCAAACA TGTCATTCTT TATATATGAC AGCTTTGTTT TAATGTGGGG TTTTTTTGTT 2052TGCTTTTTTT GTTTTGTT 2070 532 amino acids amino acid linear protein notprovided 14 Met Thr Gln Leu Tyr Thr Tyr Ile Arg Leu Leu Gly Ala Cys LeuPhe 1 5 10 15 Ile Ile Ser His Val Gln Gly Gln Asn Leu Asp Ser Met LeuHis Gly 20 25 30 Thr Gly Met Lys Ser Asp Leu Asp Gln 
Lys Lys Pro Glu AsnGly Val 35 40 45 Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys TyrCys Ser 50 55 60 Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile ThrAsn Gly 65 70 75 80 His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly GluThr Thr Leu 85 90 95 Thr Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe GlnCys Lys Asp 100 105 110 Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu CysCys Arg Thr Asn 115 120 125 Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu ProPro Val Val Ile Gly 130 135 140 Pro Phe Phe Asp Gly Ser Ile Arg Trp LeuVal Val Leu Ile Ser Met 145 150 155 160 Ala Val Cys Ile Val Ala Met IleIle Phe Ser Ser Cys Phe Cys Tyr 165 170 175 Lys His Tyr Cys Lys Ser IleSer Ser Arg Gly Arg Tyr Asn Arg Asp 180 185 190 Leu Glu Gln Asp Glu AlaPhe Ile Pro Val Gly Glu Ser Leu Lys Asp 195 200 205 Leu Ile Asp Gln SerGln Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu 210 215 220 Leu Val Gln ArgThr Ile Ala Lys Gln Ile Gln Met Val Arg Gln Val 225 230 235 240 Gly LysGly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg Gly Glu 245 250 255 LysVal Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe 260 265 270Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu Asn Ile 275 280285 Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln 290295 300 Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe305 310 315 320 Leu Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys LeuAla Tyr 325 330 335 Ser Ala Ala Cys Gly Leu Cys His Leu His Thr Glu IleTyr Gly Thr 340 345 350 Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu LysSer Lys Asn Ile 355 360 365 Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile AlaAsp Leu Gly Leu Ala 370 375 380 Val Lys Phe Asn Ser Asp Thr Asn Glu ValAsp Ile Pro Leu Asn Thr 385 390 395 400 Arg Val Gly Thr Lys Arg Tyr MetAla Pro Glu Val Leu Asp Glu Ser 405 410 415 Leu Asn Lys Asn His Phe GlnPro Tyr Ile Met Ala Asp Ile Tyr Ser 420 425 430 Phe Gly Leu Ile Ile TrpGlu Met Ala Arg Arg Cys Ile Thr Gly Gly 435 440 445 Ile Val Glu Glu TyrGln Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp 450 455 460 Pro 
Ser Tyr GluAsp Met Arg Glu Val Val Cys Val Lys Arg Leu Arg 465 470 475 480 Pro IleVal Ser Asn Arg Trp Asn Ser Asp Glu Cys Leu Arg Ala Val 485 490 495 LeuLys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala Ser Arg Leu 500 505 510Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val Glu Ser Gln 515 520525 Asp Val Lys Ile 530 2160 base pairs nucleic acid unknown linear cDNANO NO internal Mouse CDS 10..1524 15 CGCGGTTAC ATG GCG GAG TCG GCC GGAGCC TCC TCC TTC TTC CCC CTT 48 Met Ala Glu Ser Ala Gly Ala Ser Ser PhePhe Pro Leu 1 5 10 GTT GTC CTC CTG CTC GCC GGC AGC GGC GGG TCC GGG CCCCGG GGG ATC 96 Val Val Leu Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro ArgGly Ile 15 20 25 CAG GCT CTG CTG TGT GCG TGC ACC AGC TGC CTA CAG ACC AACTAC ACC 144 Gln Ala Leu Leu Cys Ala Cys Thr Ser Cys Leu Gln Thr Asn TyrThr 30 35 40 45 TGT GAG ACA GAT GGG GCT TGC ATG GTC TCC ATC TTT AAC CTGGAT GGC 192 Cys Glu Thr Asp Gly Ala Cys Met Val Ser Ile Phe Asn Leu AspGly 50 55 60 GTG GAG CAC CAT GTA CGT ACC TGC ATC CCC AAG GTG GAG CTG GTTCCT 240 Val Glu His His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro65 70 75 GCT GGA AAG CCC TTC TAC TGC CTG AGT TCA GAG GAT CTG CGC AAC ACA288 Ala Gly Lys Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr 8085 90 CAC TGC TGC TAT ATT GAC TTC TGC AAC AAG ATT GAC CTC AGG GTC CCC336 His Cys Cys Tyr Ile Asp Phe Cys Asn Lys Ile Asp Leu Arg Val Pro 95100 105 AGC GGA CAC CTC AAG GAG CCT GCG CAC CCC TCC ATG TGG GGC CCT GTG384 Ser Gly His Leu Lys Glu Pro Ala His Pro Ser Met Trp Gly Pro Val 110115 120 125 GAG CTG GTC GGC ATC ATC GCC GGC CCC GTC TTC CTC CTC TTC CTTATC 432 Glu Leu Val Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile130 135 140 ATT ATC ATC GTC TTC CTG GTC ATC AAC TAT CAC CAG CGT GTC TACCAT 480 Ile Ile Ile Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His145 150 155 AAC CGC CAG AGG TTG GAC ATG GAG GAC CCC TCT TGC GAG ATG TGTCTC 528 Asn Arg Gln Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu160 165 170 TCC AAA GAC AAG ACG CTC CAG GAT CTC GTC TAC GAC CTC TCC ACGTCA 
576 Ser Lys Asp Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser175 180 185 GGG TCT GGC TCA GGG TTA CCC CTT TTT GTC CAG CGC ACA GTG GCCCGA 624 Gly Ser Gly Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg190 195 200 205 ACC ATT GTT TTA CAA GAG ATT ATC GGC AAG GGC CGG TTC GGGGAA GTA 672 Thr Ile Val Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly GluVal 210 215 220 TGG CGT GGT CGC TGG AGG GGT GGT GAC GTG GCT GTG AAA ATCTTC TCT 720 Trp Arg Gly Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile PheSer 225 230 235 TCT CGT GAA GAA CGG TCT TGG TTC CGT GAA GCA GAG ATC TACCAG ACC 768 Ser Arg Glu Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr GlnThr 240 245 250 GTC ATG CTG CGC CAT GAA AAC ATC CTT GGC TTT ATT GCT GCTGAC AAT 816 Val Met Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala AspAsn 255 260 265 AAA GAT AAT GGC ACC TGG ACC CAG CTG TGG CTT GTC TCT GACTAT CAC 864 Lys Asp Asn Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp TyrHis 270 275 280 285 GAG CAT GGC TCA CTG TTT GAT TAT CTG AAC CGC TAC ACAGTG ACC ATT 912 Glu His Gly Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr ValThr Ile 290 295 300 GAG GGA ATG ATT AAG CTA GCC TTG TCT GCA GCC AGT GGTTTG GCA CAC 960 Glu Gly Met Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly LeuAla His 305 310 315 CTG CAT ATG GAG ATT GTG GGC ACT CAA GGG AAG CCG GGAATT GCT CAT 1008 Leu His Met Glu Ile Val Gly Thr Gln Gly Lys Pro Gly IleAla His 320 325 330 CGA GAC TTG AAG TCA AAG AAC ATC CTG GTG AAA AAA AATGGC ATG TGT 1056 Arg Asp Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn GlyMet Cys 335 340 345 GCC ATT GCA GAC CTG GGC CTG GCT GTC CGT CAT GAT GCGGTC ACT GAC 1104 Ala Ile Ala Asp Leu Gly Leu Ala Val Arg His Asp Ala ValThr Asp 350 355 360 365 ACC ATA GAC ATT GCT CCA AAT CAG AGG GTG GGG ACCAAA CGA TAC ATG 1152 Thr Ile Asp Ile Ala Pro Asn Gln Arg Val Gly Thr LysArg Tyr Met 370 375 380 GCT CCT GAA GTC CTT GAC GAG ACA ATC AAC ATG AAGCAC TTT GAC TCC 1200 Ala Pro Glu Val Leu Asp Glu Thr Ile Asn Met Lys HisPhe Asp Ser 385 390 395 TTC AAA TGT GCC GAC ATC TAT GCC CTC GGG CTT GTCTAC TGG 
GAG ATT 1248 Phe Lys Cys Ala Asp Ile Tyr Ala Leu Gly Leu Val TyrTrp Glu Ile 400 405 410 GCA CGA AGA TGC AAT TCT GGA GGA GTC CAT GAA GACTAT CAA CTG CCG 1296 Ala Arg Arg Cys Asn Ser Gly Gly Val His Glu Asp TyrGln Leu Pro 415 420 425 TAT TAC GAC TTA GTG CCC TCC GAC CCT TCC ATT GAGGAG ATG CGA AAG 1344 Tyr Tyr Asp Leu Val Pro Ser Asp Pro Ser Ile Glu GluMet Arg Lys 430 435 440 445 GTT GTA TGT GAC CAG AAG CTA CGG CCC AAT GTCCCC AAC TGG TGG CAG 1392 Val Val Cys Asp Gln Lys Leu Arg Pro Asn Val ProAsn Trp Trp Gln 450 455 460 AGT TAT GAG GCC TTG CGA GTG ATG GGA AAG ATGATG CGG GAG TGC TGG 1440 Ser Tyr Glu Ala Leu Arg Val Met Gly Lys Met MetArg Glu Cys Trp 465 470 475 TAC GCC AAT GGT GCT GCC CGT CTG ACA GCT CTGCGC ATC AAG AAG ACT 1488 Tyr Ala Asn Gly Ala Ala Arg Leu Thr Ala Leu ArgIle Lys Lys Thr 480 485 490 CTG TCC CAG CTA AGC GTG CAG GAA GAT GTG AAGATT TAAGCTGTTC 1534 Leu Ser Gln Leu Ser Val Gln Glu Asp Val Lys Ile 495500 505 CTCTGCCTAC ACAAAGAACC TGGGCAGTGA GGATGACTGC AGCCACCGTGCAAGCGTCGT 1594 GGAGGCCTAT CCTCTTGTTT CTGCCCGGCC CTCTGGCAGA GCCCTGGCCTGCAAGAGGGA 1654 CAGAGCCTGG GAGACGCGCG CACTCCCGTT GGGTTTGAGA CAGACACTTTTTATATTTAC 1714 CTCCTGATGG CATGGAGACC TGAGCAAATC ATGTAGTCAC TCAATGCCACAACTCAAACT 1774 GCTTCAGTGG GAAGTACAGA GACCCAGTGC ATTGCGTGTG CAGGAGCGTGAGGTGCTGGG 1834 CTCGCCAGGA GCGGCCCCCA TACCTTGTGG TCCACTGGGC TGCAGGTTTTCCTCCAGGGA 1894 CCAGTCAACT GGCATCAAGA TATTGAGAGG AACCGGAAGT TTCTCCCTCCTTCCCGTAGC 1954 AGTCCTGAGC CACACCATCC TTCTCATGGA CATCCGGAGG ACTGCCCCTAGAGACACAAC 2014 CTGCTGCCTG TCTGTCCAGC CAAGTGCGCA TGTGCCGAGG TGTGTCCCACATTGTGCCTG 2074 GTCTGTGCCA CGCCCGTGTG TGTGTGTGTG TGTGTGAGTG AGTGTGTGTGTGTACACTTA 2134 ACCTGCTTGA GCTTCTGTGC ATGTGT 2160 505 amino acids aminoacid linear protein not provided 16 Met Ala Glu Ser Ala Gly Ala Ser SerPhe Phe Pro Leu Val Val Leu 1 5 10 15 Leu Leu Ala Gly Ser Gly Gly SerGly Pro Arg Gly Ile Gln Ala Leu 20 25 30 Leu Cys Ala Cys Thr Ser Cys LeuGln Thr Asn Tyr Thr Cys Glu Thr 35 40 45 Asp Gly Ala Cys Met Val Ser IlePhe Asn Leu Asp Gly Val Glu His 
50 55 60 His Val Arg Thr Cys Ile Pro LysVal Glu Leu Val Pro Ala Gly Lys 65 70 75 80 Pro Phe Tyr Cys Leu Ser SerGlu Asp Leu Arg Asn Thr His Cys Cys 85 90 95 Tyr Ile Asp Phe Cys Asn LysIle Asp Leu Arg Val Pro Ser Gly His 100 105 110 Leu Lys Glu Pro Ala HisPro Ser Met Trp Gly Pro Val Glu Leu Val 115 120 125 Gly Ile Ile Ala GlyPro Val Phe Leu Leu Phe Leu Ile Ile Ile Ile 130 135 140 Val Phe Leu ValIle Asn Tyr His Gln Arg Val Tyr His Asn Arg Gln 145 150 155 160 Arg LeuAsp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp 165 170 175 LysThr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly 180 185 190Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg Thr Ile Val 195 200205 Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly 210215 220 Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu225 230 235 240 Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr ValMet Leu 245 250 255 Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp AsnLys Asp Asn 260 265 270 Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp TyrHis Glu His Gly 275 280 285 Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr ValThr Ile Glu Gly Met 290 295 300 Ile Lys Leu Ala Leu Ser Ala Ala Ser GlyLeu Ala His Leu His Met 305 310 315 320 Glu Ile Val Gly Thr Gln Gly LysPro Gly Ile Ala His Arg Asp Leu 325 330 335 Lys Ser Lys Asn Ile Leu ValLys Lys Asn Gly Met Cys Ala Ile Ala 340 345 350 Asp Leu Gly Leu Ala ValArg His Asp Ala Val Thr Asp Thr Ile Asp 355 360 365 Ile Ala Pro Asn GlnArg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu 370 375 380 Val Leu Asp GluThr Ile Asn Met Lys His Phe Asp Ser Phe Lys Cys 385 390 395 400 Ala AspIle Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile Ala Arg Arg 405 410 415 CysAsn Ser Gly Gly Val His Glu Asp Tyr Gln Leu Pro Tyr Tyr Asp 420 425 430Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys Val Val Cys 435 440445 Asp Gln Lys Leu Arg Pro Asn Val Pro Asn Trp Trp Gln Ser Tyr Glu 450455 460 Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn465 470 475 480 Gly Ala Ala Arg Leu Thr 
Ala Leu Arg Ile Lys Lys Thr LeuSer Gln 485 490 495 Leu Ser Val Gln Glu Asp Val Lys Ile 500 505 1952base pairs nucleic acid unknown unknown cDNA NO NO internal Mouse CDS187..1692 17 AAGCGGCGGC AGAAGTTGCC GGCGTGGTGC TCGTAGTGAG GGCGCGGAGGACCCGGGACC 60 TGGGAAGCGG CGGCGGGTTA ACTTCGGCTG AATCACAACC ATTTGGCGCTGAGCTATGAC 120 AAGAGAGCAA ACAAAAAGTT AAAGGAGCAA CCCGGCCATA AGTGAAGAGAGAAGTTTATT 180 GATAAC ATG CTC TTA CGA AGC TCT GGA AAA TTA AAT GTG GGCACC AAG 228 Met Leu Leu Arg Ser Ser Gly Lys Leu Asn Val Gly Thr Lys 1 510 AAG GAG GAT GGA GAG AGT ACA GCC CCC ACC CCT CGG CCC AAG ATC CTA 276Lys Glu Asp Gly Glu Ser Thr Ala Pro Thr Pro Arg Pro Lys Ile Leu 15 20 2530 CGT TGT AAA TGC CAC CAC CAC TGT CCG GAA GAC TCA GTC AAC AAT ATC 324Arg Cys Lys Cys His His His Cys Pro Glu Asp Ser Val Asn Asn Ile 35 40 45TGC AGC ACA GAT GGG TAC TGC TTC ACG ATG ATA GAA GAA GAT GAC TCT 372 CysSer Thr Asp Gly Tyr Cys Phe Thr Met Ile Glu Glu Asp Asp Ser 50 55 60 GGAATG CCT GTT GTC ACC TCT GGA TGT CTA GGA CTA GAA GGG TCA GAT 420 Gly MetPro Val Val Thr Ser Gly Cys Leu Gly Leu Glu Gly Ser Asp 65 70 75 TTT CAATGT CGT GAC ACT CCC ATT CCT CAT CAA AGA AGA TCA ATT GAA 468 Phe Gln CysArg Asp Thr Pro Ile Pro His Gln Arg Arg Ser Ile Glu 80 85 90 TGC TGC ACAGAA AGG AAT GAG TGT AAT AAA GAC CTC CAC CCC ACT CTG 516 Cys Cys Thr GluArg Asn Glu Cys Asn Lys Asp Leu His Pro Thr Leu 95 100 105 110 CCT CCTCTC AAG GAC AGA GAT TTT GTT GAT GGG CCC ATA CAC CAC AAG 564 Pro Pro LeuLys Asp Arg Asp Phe Val Asp Gly Pro Ile His His Lys 115 120 125 GCC TTGCTT ATC TCT GTG ACT GTC TGT AGT TTA CTC TTG GTC CTC ATT 612 Ala Leu LeuIle Ser Val Thr Val Cys Ser Leu Leu Leu Val Leu Ile 130 135 140 ATT TTATTC TGT TAC TTC AGG TAT AAA AGA CAA GAA GCC CGA CCT CGG 660 Ile Leu PheCys Tyr Phe Arg Tyr Lys Arg Gln Glu Ala Arg Pro Arg 145 150 155 TAC AGCATT GGG CTG GAG CAG GAC GAG ACA TAC ATT CCT CCT GGA GAG 708 Tyr Ser IleGly Leu Glu Gln Asp Glu Thr Tyr Ile Pro Pro Gly Glu 160 165 170 TCC CTGAGA GAC TTG ATC GAG CAG TCT CAG AGC TCG GGA AGT GGA TCA 756 Ser 
Leu ArgAsp Leu Ile Glu Gln Ser Gln Ser Ser Gly Ser Gly Ser 175 180 185 190 GGCCTC CCT CTG CTG GTC CAA AGG ACA ATA GCT AAG CAA ATT CAG ATG 804 Gly LeuPro Leu Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met 195 200 205 GTGAAG CAG ATT GGA AAA GGC CGC TAT GGC GAG GTG TGG ATG GGA AAG 852 Val LysGln Ile Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys 210 215 220 TGGCGT GGA GAA AAG GTG GCT GTG AAA GTG TTC TTC ACC ACG GAG GAA 900 Trp ArgGly Glu Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu 225 230 235 GCCAGC TGG TTC CGA GAG ACT GAG ATA TAT CAG ACG GTC CTG ATG CGG 948 Ala SerTrp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg 240 245 250 CATGAG AAT ATT CTG GGG TTC ATT GCT GCA GAT ATC AAA GGG ACT GGG 996 His GluAsn Ile Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly 255 260 265 270TCC TGG ACT CAG TTG TAC CTC ATC ACA GAC TAT CAT GAA AAC GGC TCC 1044 SerTrp Thr Gln Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser 275 280 285CTT TAT GAC TAT CTG AAA TCC ACC ACC TTA GAC GCA AAG TCC ATG CTG 1092 LeuTyr Asp Tyr Leu Lys Ser Thr Thr Leu Asp Ala Lys Ser Met Leu 290 295 300AAG CTA GCC TAC TCC TCT GTC AGC GGC CTA TGC CAT TTA CAC ACG GAA 1140 LysLeu Ala Tyr Ser Ser Val Ser Gly Leu Cys His Leu His Thr Glu 305 310 315ATC TTT AGC ACT CAA GGC AAG CCA GCA ATC GCC CAT CGA GAC TTG AAA 1188 IlePhe Ser Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys 320 325 330AGT AAA AAC ATC CTG GTG AAG AAA AAT GGA ACT TGC TGC ATA GCA GAC 1236 SerLys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp 335 340 345350 CTG GGC TTG GCT GTC AAG TTC ATT AGT GAC ACA AAT GAG GTT GAC ATC 1284Leu Gly Leu Ala Val Lys Phe Ile Ser Asp Thr Asn Glu Val Asp Ile 355 360365 CCA CCC AAC ACC CGG GTT GGC ACC AAG CGC TAT ATG CCT CCA GAA GTG 1332Pro Pro Asn Thr Arg Val Gly Thr Lys Arg Tyr Met Pro Pro Glu Val 370 375380 CTG GAC GAG AGC TTG AAT AGA AAC CAT TTC CAG TCC TAC ATT ATG GCT 1380Leu Asp Glu Ser Leu Asn Arg Asn His Phe Gln Ser Tyr Ile Met Ala 385 390395 GAC ATG TAC AGC TTT GGA CTC ATC CTC TGG GAG ATT GCA AGG AGA TGT 
1428Asp Met Tyr Ser Phe Gly Leu Ile Leu Trp Glu Ile Ala Arg Arg Cys 400 405410 GTT TCT GGA GGT ATA GTG GAA GAA TAC CAG CTT CCC TAT CAC GAC CTG 1476Val Ser Gly Gly Ile Val Glu Glu Tyr Gln Leu Pro Tyr His Asp Leu 415 420425 430 GTG CCC AGT GAC CCT TCT TAT GAG GAC ATG AGA GAA ATT GTG TGC ATG1524 Val Pro Ser Asp Pro Ser Tyr Glu Asp Met Arg Glu Ile Val Cys Met 435440 445 AAG AAG TTA CGG CCT TCA TTC CCC AAT CGA TGG AGC AGT GAT GAG TGT1572 Lys Lys Leu Arg Pro Ser Phe Pro Asn Arg Trp Ser Ser Asp Glu Cys 450455 460 CTC AGG CAG ATG GGG AAG CTT ATG ACA GAG TGC TGG GCG CAG AAT CCT1620 Leu Arg Gln Met Gly Lys Leu Met Thr Glu Cys Trp Ala Gln Asn Pro 465470 475 GCC TCC AGG CTG ACG GCC CTG AGA GTT AAG AAA ACC CTT GCC AAA ATG1668 Ala Ser Arg Leu Thr Ala Leu Arg Val Lys Lys Thr Leu Ala Lys Met 480485 490 TCA GAG TCC CAG GAC ATT AAA CTC TGACGTCAGA TACTTGTGGA CAGAGCAAGA1722 Ser Glu Ser Gln Asp Ile Lys Leu 495 500 ATTTCACAGA AGCATCGTTAGCCCAAGCCT TGAACGTTAG CCTACTGCCC AGTGAGTTCA 1782 GACTTTCCTG GAAGAGAGCACGGTGGGCAG ACACAGAGGA ACCCAGAAAC ACGGATTCAT 1842 CATGGCTTTC TGAGGAGGAGAAACTGTTTG GGTAACTTGT TCAAGATATG ATGCATGTTG 1902 CTTTCTAAGA AAGCCCTGTATTTTGAATTA CCATTTTTTT ATAAAAAAAA 1952 502 amino acids amino acid linearprotein not provided 18 Met Leu Leu Arg Ser Ser Gly Lys Leu Asn Val GlyThr Lys Lys Glu 1 5 10 15 Asp Gly Glu Ser Thr Ala Pro Thr Pro Arg ProLys Ile Leu Arg Cys 20 25 30 Lys Cys His His His Cys Pro Glu Asp Ser ValAsn Asn Ile Cys Ser 35 40 45 Thr Asp Gly Tyr Cys Phe Thr Met Ile Glu GluAsp Asp Ser Gly Met 50 55 60 Pro Val Val Thr Ser Gly Cys Leu Gly Leu GluGly Ser Asp Phe Gln 65 70 75 80 Cys Arg Asp Thr Pro Ile Pro His Gln ArgArg Ser Ile Glu Cys Cys 85 90 95 Thr Glu Arg Asn Glu Cys Asn Lys Asp LeuHis Pro Thr Leu Pro Pro 100 105 110 Leu Lys Asp Arg Asp Phe Val Asp GlyPro Ile His His Lys Ala Leu 115 120 125 Leu Ile Ser Val Thr Val Cys SerLeu Leu Leu Val Leu Ile Ile Leu 130 135 140 Phe Cys Tyr Phe Arg Tyr LysArg Gln Glu Ala Arg Pro Arg Tyr Ser 145 150 155 160 Ile Gly Leu Glu GlnAsp Glu 
Thr Tyr Ile Pro Pro Gly Glu Ser Leu 165 170 175 Arg Asp Leu IleGlu Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu 180 185 190 Pro Leu LeuVal Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Lys 195 200 205 Gln IleGly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg 210 215 220 GlyGlu Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser 225 230 235240 Trp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu 245250 255 Asn Ile Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp260 265 270 Thr Gln Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser LeuTyr 275 280 285 Asp Tyr Leu Lys Ser Thr Thr Leu Asp Ala Lys Ser Met LeuLys Leu 290 295 300 Ala Tyr Ser Ser Val Ser Gly Leu Cys His Leu His ThrGlu Ile Phe 305 310 315 320 Ser Thr Gln Gly Lys Pro Ala Ile Ala His ArgAsp Leu Lys Ser Lys 325 330 335 Asn Ile Leu Val Lys Lys Asn Gly Thr CysCys Ile Ala Asp Leu Gly 340 345 350 Leu Ala Val Lys Phe Ile Ser Asp ThrAsn Glu Val Asp Ile Pro Pro 355 360 365 Asn Thr Arg Val Gly Thr Lys ArgTyr Met Pro Pro Glu Val Leu Asp 370 375 380 Glu Ser Leu Asn Arg Asn HisPhe Gln Ser Tyr Ile Met Ala Asp Met 385 390 395 400 Tyr Ser Phe Gly LeuIle Leu Trp Glu Ile Ala Arg Arg Cys Val Ser 405 410 415 Gly Gly Ile ValGlu Glu Tyr Gln Leu Pro Tyr His Asp Leu Val Pro 420 425 430 Ser Asp ProSer Tyr Glu Asp Met Arg Glu Ile Val Cys Met Lys Lys 435 440 445 Leu ArgPro Ser Phe Pro Asn Arg Trp Ser Ser Asp Glu Cys Leu Arg 450 455 460 GlnMet Gly Lys Leu Met Thr Glu Cys Trp Ala Gln Asn Pro Ala Ser 465 470 475480 Arg Leu Thr Ala Leu Arg Val Lys Lys Thr Leu Ala Lys Met Ser Glu 485490 495 Ser Gln Asp Ile Lys Leu 500 28 base pairs nucleic acid singlelinear cDNA NO NO not provided 19 GCGGATCCTG TTGTGAAGGN AATATGTG 28 24base pairs nucleic acid single linear cDNA NO NO not provided 20GCGATCCGTC GCAGTCAAAA TTTT 24 26 base pairs nucleic acid single linearcDNA NO NO not provided 21 GCGGATCCGC GATATATTAA AAGCAA 26 20 base pairsnucleic acid single linear cDNA NO YES not provided 22 CGGAATTCTGGTGCCATATA 20 37 base pairs nucleic 
acid single linear cDNA NO NO not provided 23 ATTCAAGGGC ACATCAACTT CATTTGTGTC ACTGTTG 37 26 base pairs nucleic acid single linear cDNA NO NO not provided 24 GCGGATCCAC CATGGCGGAG TCGGCC 26 20 base pairs nucleic acid single linear cDNA NO NO not provided 25 AACACCGGGC CGGCGATGAT 20 6 amino acids amino acid linear peptide internal not provided 26 Gly Xaa Gly Xaa Xaa Gly 1 5 6 amino acids amino acid linear peptide not provided 27 Asp Phe Lys Ser Arg Asn 1 5 6 amino acids amino acid linear peptide not provided 28 Asp Leu Lys Ser Lys Asn 1 5 6 amino acids amino acid linear peptide not provided 29 Gly Thr Lys Arg Tyr Met 1 5

What is claimed is:
1. An isolated extrachromosomal nucleic acid molecule, which encodes an ALK-1 protein having the amino acid sequence set forth at SEQ ID NO: 2.
2. An isolated extrachromosomal nucleic acid molecule which encodes an ALK-1 protein having the amino acid sequence set forth at SEQ ID NO: 12.
3. The isolated extrachromosomal nucleic acid molecule of claim 1, having the nucleotide sequence set forth at SEQ ID NO: 1.
4. The isolated extrachromosomal nucleic acid molecule of claim 2, having the nucleotide sequence set forth at SEQ ID NO: 11.
5. Expression vector comprising the isolated extrachromosomal nucleic acid molecule of claim 1, operably linked to a promoter.
6. Expression vector comprising the isolated extrachromosomal nucleic acid molecule of claim 2, operably linked to a promoter.
7. Prokaryotic cell or eukaryotic cell, transformed or transfected with the isolated extrachromosomal nucleic acid molecule of claim 1.
8. Prokaryotic cell or eukaryotic cell, transformed or transfected with the isolated extrachromosomal nucleic acid molecule of claim 2.
9. Prokaryotic cell or eukaryotic cell, transformed or transfected with the expression vector of claim 5.
10. Prokaryotic cell or eukaryotic cell, transformed or transfected with the expression vector of claim 6.
11. The prokaryotic cell of claim 7, wherein said cell is E. coli.
12. The prokaryotic cell of claim 8, wherein said cell is E. coli.
13. The eukaryotic cell of claim 7, wherein said cell is S. cerevisiae, PAE, COS or CHO.
14. The eukaryotic cell of claim 8, wherein said cell is S. cerevisiae, PAE, COS or CHO.
15. The prokaryotic cell of claim 9, wherein said cell is E. coli.
16. The prokaryotic cell of claim 10, wherein said cell is E. coli.
17. The eukaryotic cell of claim 9, wherein said cell is S. cerevisiae, PAE, COS or CHO.
18. The eukaryotic cell of claim 10, wherein said cell is S. cerevisiae, PAE, COS or CHO.
19. An isolated ALK-1 protein which is encoded by the isolated nucleic acid molecule of claim 1.
20. An isolated ALK-1 protein which is encoded by the isolated nucleic acid molecule of claim 2.
21. The isolated ALK-1 protein, having the amino acid sequence set forth in SEQ ID NO: 2.
22. The isolated ALK-1 protein having the amino acid sequence set forth in SEQ ID NO: 12.